Converting array string to string and back in Java
Solution 1
If you don't want to spend much time on string operations, you could use Java serialization plus Commons Codec like this:
public void stringArrayTest() throws IOException, ClassNotFoundException, DecoderException {
    String[] strs = new String[] {"test 1", "test 2", "test 3"};
    System.out.println(Arrays.toString(strs));
    // serialize
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    new ObjectOutputStream(out).writeObject(strs);
    // your string
    String yourString = new String(Hex.encodeHex(out.toByteArray()));
    System.out.println(yourString);
    // deserialize
    ByteArrayInputStream in = new ByteArrayInputStream(Hex.decodeHex(yourString.toCharArray()));
    System.out.println(Arrays.toString((String[]) new ObjectInputStream(in).readObject()));
}
This prints the following output:
[test 1, test 2, test 3]
aced0005757200135b4c6a6176612e6c616e672e537472696e673badd256e7e91d7b47020000787000000003740006746573742031740006746573742032740006746573742033
[test 1, test 2, test 3]
If you are using Maven, you can use the following dependency for Commons Codec:
<dependency>
    <groupId>commons-codec</groupId>
    <artifactId>commons-codec</artifactId>
    <version>1.2</version>
</dependency>
As suggested, the same can be done with Base64 (only two lines change):
String yourString = new String(Base64.encodeBase64(out.toByteArray()));
ByteArrayInputStream in = new ByteArrayInputStream(Base64.decodeBase64(yourString.getBytes()));
With Base64 the resulting string is shorter. For the code shown above, the output is:
[test 1, test 2, test 3]
rO0ABXVyABNbTGphdmEubGFuZy5TdHJpbmc7rdJW5+kde0cCAAB4cAAAAAN0AAZ0ZXN0IDF0AAZ0ZXN0IDJ0AAZ0ZXN0IDM=
[test 1, test 2, test 3]
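On Java 8 and later, the standard library's java.util.Base64 can stand in for Commons Codec, so no extra dependency is needed. The following is only a minimal sketch of the same serialize-then-encode idea: the class name StringArrayBase64 and its encode/decode methods are hypothetical names chosen for illustration, and checked exceptions are wrapped so the helpers are easy to call.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.Base64;

// Hypothetical helper; the class and method names are illustrative only.
final class StringArrayBase64 {

    // Serializes the array with Java serialization, then Base64-encodes the bytes.
    static String encode(String[] strs) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(out)) {
            oos.writeObject(strs);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The object stream is closed (and therefore flushed) at this point.
        return Base64.getEncoder().encodeToString(out.toByteArray());
    }

    // Decodes the Base64 text and deserializes it back into a String[].
    static String[] decode(String encoded) {
        byte[] bytes = Base64.getDecoder().decode(encoded);
        try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return (String[]) ois.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String[] strs = {"test 1", "test 2", "test 3"};
        String yourString = encode(strs);
        System.out.println(yourString);
        System.out.println(Arrays.toString(decode(yourString)));
    }
}
```

Because the whole array travels as one Base64 string, the strings may contain any character, including the separators discussed elsewhere on this page.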
Regarding the time taken by each approach, I ran 10^5 executions of each method, with the following results:
- String manipulation: 156 ms
- Hex: 376 ms
- Base64: 379 ms
Code used for test:
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.util.StringTokenizer;
import org.apache.commons.codec.DecoderException;
import org.apache.commons.codec.binary.Base64;
import org.apache.commons.codec.binary.Hex;
public class StringArrayRepresentationTest {
    public static void main(String[] args) throws IOException, ClassNotFoundException, DecoderException {
        String[] strs = new String[] {"test 1", "test 2", "test 3"};
        long t = System.currentTimeMillis();
        for (int i = 0; i < 100000; i++) {
            stringManipulation(strs);
        }
        System.out.println("String manipulation: " + (System.currentTimeMillis() - t));
        t = System.currentTimeMillis();
        for (int i = 0; i < 100000; i++) {
            testHex(strs);
        }
        System.out.println("Hex: " + (System.currentTimeMillis() - t));
        t = System.currentTimeMillis();
        for (int i = 0; i < 100000; i++) {
            testBase64(strs);
        }
        System.out.println("Base64: " + (System.currentTimeMillis() - t));
    }
    public static void stringManipulation(String[] strs) {
        String result = serialize(strs);
        unserialize(result);
    }
    // Reads the lengths after the last '$', then cuts the strings out of the front part.
    private static String[] unserialize(String result) {
        int sizesSplitPoint = result.lastIndexOf('$');
        String sizes = result.substring(sizesSplitPoint + 1);
        StringTokenizer st = new StringTokenizer(sizes, ";");
        String[] resultArray = new String[st.countTokens()];
        int i = 0;
        int lastPosition = 0;
        while (st.hasMoreTokens()) {
            String stringLengthStr = st.nextToken();
            int stringLength = Integer.parseInt(stringLengthStr);
            resultArray[i++] = result.substring(lastPosition, lastPosition + stringLength);
            lastPosition += stringLength;
        }
        return resultArray;
    }
    // Concatenates all strings, then appends '$' and the ';'-separated lengths.
    private static String serialize(String[] strs) {
        StringBuilder sizes = new StringBuilder("$");
        StringBuilder result = new StringBuilder();
        for (String str : strs) {
            if (sizes.length() != 1) {
                sizes.append(';');
            }
            sizes.append(str.length());
            result.append(str);
        }
        result.append(sizes.toString());
        return result.toString();
    }
    public static void testBase64(String[] strs) throws IOException, ClassNotFoundException, DecoderException {
        // serialize
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        new ObjectOutputStream(out).writeObject(strs);
        // your string
        String yourString = new String(Base64.encodeBase64(out.toByteArray()));
        // decode only; the array is not read back with an ObjectInputStream here
        ByteArrayInputStream in = new ByteArrayInputStream(Base64.decodeBase64(yourString.getBytes()));
    }
    public static void testHex(String[] strs) throws IOException, ClassNotFoundException, DecoderException {
        // serialize
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        new ObjectOutputStream(out).writeObject(strs);
        // your string
        String yourString = new String(Hex.encodeHex(out.toByteArray()));
        // decode only; the array is not read back with an ObjectInputStream here
        ByteArrayInputStream in = new ByteArrayInputStream(Hex.decodeHex(yourString.toCharArray()));
    }
}
Solution 2
Use a JSON parser like Jackson, which can also serialize and deserialize other types of objects (integers, floats, etc.) to strings and back.
Janek
Updated on July 19, 2022
Comments
-
Janek almost 2 years
I have an array String[] in Java and must first encode/convert it into a String, and then later in the code convert it back to the String[] array. The thing is that any character can appear in a string in the String[] array, so I must be very careful when encoding. All the information necessary to decode it must be in the final string; I cannot return a string and some other information in an extra variable.
The algorithm I have devised so far is to:
Append all the strings next to each other, for example like this: String[] a = {"lala", "exe", "a"} into String b = "lalaexea"
Append at the end of the string the lengths of all the strings from String[], separated from the main text by $ sign and then each length separated by a comma, so:
b = "lalaexea$4,3,1"
Then when converting it back, I would first read the lengths from behind and then based on them, the real strings.
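The steps above can be sketched in Java roughly as follows. This is only an illustration of the described format ("lalaexea$4,3,1"): the class name LengthSuffixCodec and its method names are hypothetical, and the sketch assumes the array contains no null elements.

```java
import java.util.Arrays;

// Hypothetical helper; names are illustrative only. Assumes no null elements.
final class LengthSuffixCodec {

    // Concatenates all strings, then appends '$' and the comma-separated lengths.
    static String encode(String[] strs) {
        StringBuilder body = new StringBuilder();
        StringBuilder sizes = new StringBuilder();
        for (int i = 0; i < strs.length; i++) {
            body.append(strs[i]);
            if (i > 0) {
                sizes.append(',');
            }
            sizes.append(strs[i].length());
        }
        return body.append('$').append(sizes).toString();
    }

    // Reads the lengths after the LAST '$', then cuts the strings from the front.
    static String[] decode(String encoded) {
        int split = encoded.lastIndexOf('$');
        String sizes = encoded.substring(split + 1);
        if (sizes.isEmpty()) {
            return new String[0]; // an empty array was encoded as just "$"
        }
        String[] tokens = sizes.split(",");
        String[] result = new String[tokens.length];
        int pos = 0;
        for (int i = 0; i < tokens.length; i++) {
            int len = Integer.parseInt(tokens[i]);
            result[i] = encoded.substring(pos, pos + len);
            pos += len;
        }
        return result;
    }

    public static void main(String[] args) {
        String b = encode(new String[] {"lala", "exe", "a"});
        System.out.println(b); // prints lalaexea$4,3,1
        System.out.println(Arrays.toString(decode(b)));
    }
}
```

Since the suffix after the last '$' contains only digits and commas, a '$' or ',' inside the data does not confuse the decoder.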
But maybe there is an easier way?
Cheers!
-
sp00m over 11 years
The OP explained that he can have any character in a string in the String[] array, so you should escape the chosen separator before joining, e.g. s.replaceAll("\\$", "\\\\\\$");
-
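The escaping idea from the comment above could look roughly like the sketch below. It is an assumption-laden illustration, not sp00m's exact code: the class name EscapedJoin is hypothetical, '$' is used as the separator and '\' as the escape character, and the escape character itself is escaped too so that a string ending in '\' stays unambiguous. A '$' is written after every element (a terminator rather than a separator) so that an empty array and an array holding one empty string encode differently.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; names, separator and escape character are illustrative.
final class EscapedJoin {
    private static final char SEP = '$';
    private static final char ESC = '\\';

    // Escapes SEP and ESC inside each string and terminates each element with SEP.
    static String join(String[] strs) {
        StringBuilder sb = new StringBuilder();
        for (String s : strs) {
            for (int i = 0; i < s.length(); i++) {
                char c = s.charAt(i);
                if (c == SEP || c == ESC) {
                    sb.append(ESC);
                }
                sb.append(c);
            }
            sb.append(SEP);
        }
        return sb.toString();
    }

    // Walks the text once: an escaped char is taken literally, a bare SEP ends an element.
    static String[] split(String joined) {
        List<String> parts = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < joined.length(); i++) {
            char c = joined.charAt(i);
            if (c == ESC && i + 1 < joined.length()) {
                current.append(joined.charAt(++i));
            } else if (c == SEP) {
                parts.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        return parts.toArray(new String[0]);
    }
}
```

Splitting is done with a manual scan rather than a regex, since the scan only has to track whether the previous character was the escape character.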
Luiggi Mendoza over 11 years
@sp00m I would prefer to keep the data unchanged and instead propose a new pattern to separate each String (and its regex to split it back).
-
Janek over 11 years
But that does not solve the problem: the pattern can still occur in one of the strings in the String[] array. One idea would be to always draw the pattern at random, but even then a collision is possible, and it does not seem to be a very clean solution.
-
ARRG over 11 years
This is a safer method than those proposed. The overhead is larger, though; using an encoding other than hex, such as Base64, would be a good idea.
-
Francisco Spaeth over 11 years
@ARRG: thanks for your comment, I just added the changes needed to use Base64.
-
Janek over 11 years
And how is the performance of these two solutions (string manipulation vs. the one proposed in this answer)?
-
Francisco Spaeth over 11 years
Not safe at all, since this char sequence can be present in the string itself.
-
dounyy over 11 years
Well, I tend to agree with you. But you can still use chars that are forbidden somewhere else in your application, such as any char forbidden in databases, for example. Of course @ and # were examples...
-
mvreijn over 7 years
Deflating using Base64 (best option IMHO) takes about 17ms on my machine for an Integer array of 5. Inflating takes just 1ms.