Java Tip #2: Use StringBuilder for String concatenation
On to the next in this series of Java tips. Last time the performance gain was tiny and mattered only in high-load scenarios; this tip can considerably speed up even simple applications that do any String processing.
Advice
Use StringBuilder instead of + or += to concatenate Strings in non-linear cases (when the concatenations do not appear immediately one after another in a single expression), i.e. in loops.
Code-Example
Before
String test = "";
for (int i = 0; i < 50000; i++) {
    test += "abc";
}
After
StringBuilder test = new StringBuilder();
for (int i = 0; i < 50000; i++) {
    test.append("abc");
}
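To try both variants side by side, they can be wrapped in a small class. This is only a minimal sketch: the class name ConcatDemo and the method names withPlus and withBuilder are made up for illustration, and the iteration count is kept small so the slow variant finishes quickly.

```java
// Hypothetical wrapper class; ConcatDemo, withPlus and withBuilder are illustrative names.
class ConcatDemo {

    // The "Before" variant: every += produces a brand-new String.
    static String withPlus(int iterations) {
        String test = "";
        for (int i = 0; i < iterations; i++) {
            test += "abc";
        }
        return test;
    }

    // The "After" variant: one StringBuilder, converted to a String once at the end.
    static String withBuilder(int iterations) {
        StringBuilder test = new StringBuilder();
        for (int i = 0; i < iterations; i++) {
            test.append("abc");
        }
        return test.toString();
    }

    public static void main(String[] args) {
        int iterations = 1000; // small on purpose, see the lead-in
        String slow = withPlus(iterations);
        String fast = withBuilder(iterations);
        System.out.println(slow.equals(fast)); // prints true: both build the same text
        System.out.println(fast.length());     // prints 3000
    }
}
```

Note that the StringBuilder variant needs one final toString() call if a String is required afterwards.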
Benefit
Huge performance gain! The unoptimized code forces Java to create new Strings and copy their contents around all the time (because Strings are immutable in Java). The optimized code avoids this creation and copying by using a single StringBuilder. The Java compiler can optimize linear concatenations, i.e.
String test = "a" + "b" + "c";
on its own: a chain consisting only of literals, like this one, is folded into the single constant "abc" at compile time, and a chain that involves variables is turned into one StringBuilder internally. It is not clever enough, however, to apply this optimization correctly if there is more logic than just concatenation involved. Even if the concatenation is the only operation inside a loop, the Java compiler falls back to creating a whole lot of String (or, to be more exact, StringBuilder) objects and copying their contents around, which causes a huge performance impact in some cases. I measured this example to confirm it: concatenating "abc" 50000 times in a loop with += takes ~11000ms, while StringBuilder keeps the time spent near 0ms. (I also tried 500000 iterations, but stopped the first loop after ~10 minutes without it having finished, so the cost grows faster than linearly, because every iteration copies everything built so far. The second loop finished in 15ms with 500000 iterations, for the record.)
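A measurement like the one above can be reproduced with a naive timing harness along these lines. It is only a rough sketch, not a proper benchmark (no warm-up, a single run), and the class name ConcatTiming and its method names are assumptions made for illustration. Absolute numbers depend heavily on the JVM version and the hardware, so they will likely differ from the figures quoted above.

```java
// Naive timing sketch; ConcatTiming and its method names are hypothetical.
class ConcatTiming {

    // Times the += loop and returns the elapsed wall-clock time in milliseconds.
    static long timePlusMillis(int iterations) {
        long start = System.nanoTime();
        String test = "";
        for (int i = 0; i < iterations; i++) {
            test += "abc";
        }
        long elapsed = System.nanoTime() - start;
        // Use the result so the loop cannot be considered dead code.
        if (test.length() != 3 * iterations) {
            throw new IllegalStateException("unexpected length");
        }
        return elapsed / 1_000_000;
    }

    // Times the StringBuilder loop and returns the elapsed time in milliseconds.
    static long timeBuilderMillis(int iterations) {
        long start = System.nanoTime();
        StringBuilder test = new StringBuilder();
        for (int i = 0; i < iterations; i++) {
            test.append("abc");
        }
        long elapsed = System.nanoTime() - start;
        if (test.length() != 3 * iterations) {
            throw new IllegalStateException("unexpected length");
        }
        return elapsed / 1_000_000;
    }

    public static void main(String[] args) {
        // Iteration count can be passed as the first argument; defaults to the 50000 used above.
        int iterations = args.length > 0 ? Integer.parseInt(args[0]) : 50000;
        System.out.println("+= loop:       " + timePlusMillis(iterations) + " ms");
        System.out.println("StringBuilder: " + timeBuilderMillis(iterations) + " ms");
    }
}
```

For trustworthy numbers a dedicated benchmarking harness would be the better choice; this sketch is only meant to show the order-of-magnitude difference.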
Remarks
It is not necessary to optimize code like
String email = user + "@" + domain + ".com";
as javac is clever enough for these cases, and rewriting them with StringBuilder would reduce readability considerably. But things change as soon as even the simplest loop is involved. For example, the first loop gets compiled by javac into the following bytecode:
13 new java.lang.StringBuilder [24]
16 dup
17 aload_1 [test]
18 invokestatic java.lang.String.valueOf(java.lang.Object) : java.lang.String [26]
21 invokespecial java.lang.StringBuilder(java.lang.String) [32]
24 ldc <String "abc"> [35]
26 invokevirtual java.lang.StringBuilder.append(java.lang.String) : java.lang.StringBuilder [37]
29 invokevirtual java.lang.StringBuilder.toString() : java.lang.String [41]
32 astore_1 [test]
33 iinc 4 1 [i]
36 iload 4 [i]
38 ldc <Integer 50000> [45]
40 if_icmplt 13
where for each iteration of the loop (13-40) a new StringBuilder is instantiated and initialized with the String produced by the previous iteration's StringBuilder, before the constant String is appended just once and the result is converted back into a String. The optimized code results in this bytecode:
81 new java.lang.StringBuilder [24]
84 dup
85 invokespecial java.lang.StringBuilder() [62]
88 astore 4 [builder]
90 iconst_0
91 istore 5 [i]
93 goto 107
96 aload 4 [builder]
98 ldc <String "abc"> [35]
100 invokevirtual java.lang.StringBuilder.append(java.lang.String) : java.lang.StringBuilder [37]
103 pop
104 iinc 5 1 [i]
107 iload 5 [i]
109 ldc <Integer 50000> [45]
111 if_icmplt 96
in which the loop (96-111) just appends the constant String to the same StringBuilder each time, which was created only once before the loop started. No copying around and no creation of additional objects is necessary, hence the huge performance gain.
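Written back as Java source, the first bytecode listing corresponds roughly to the loop body below. This is a hand-written reconstruction from the listing, not decompiler output, and the class and method names (DesugaredConcat, desugaredPlus, charsCopiedIntoBuilders) are made up for illustration. The second method adds up how many characters are copied into the freshly created builders alone, which is a lower bound on the total copying because toString() copies as well.

```java
// Hand-written reconstruction of the first bytecode listing; all names here are hypothetical.
class DesugaredConcat {

    // Source-level equivalent of bytecode offsets 13-32: a new builder per iteration,
    // seeded with the String built so far, one append, then back to a String.
    static String desugaredPlus(int iterations) {
        String test = "";
        for (int i = 0; i < iterations; i++) {
            test = new StringBuilder(String.valueOf(test)).append("abc").toString();
        }
        return test;
    }

    // Characters copied just to seed the new builders: iteration k (counting from 0)
    // starts by copying the 3 * k characters accumulated so far.
    static long charsCopiedIntoBuilders(int iterations) {
        long total = 0;
        for (int k = 0; k < iterations; k++) {
            total += 3L * k;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(desugaredPlus(3));               // prints abcabcabc
        System.out.println(charsCopiedIntoBuilders(50000)); // prints 3749925000
    }
}
```

The copy count grows with the square of the iteration count, which fits the observation that ten times as many iterations took far more than ten times as long. The optimized loop only ever appends 3 characters per iteration to the one existing builder.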