Measure your assumptions, and then measure again

Happy New Year! I planned to have a blog post finished before 2013 ended but this didn't work out. Hopefully this one compensates that a bit.

A little while ago I began preparing another post in my series of Java Tips. Well, this posting is still due but for now I can present you another one where I give a short look on some investigation the preparation of this Java tip required.

It originated in a couple of Sonar warnings on one of our development systems on my job. Specifically the warning about the inefficient way of converting a number to String spiked my interest and I began evaluating the places where these warnings showed up. The warning itself told that there were conversions of numbers to Strings in the form of

String s = new Integer(intval).toString();

which should be better expressed from a performance point of view as

String s = Integer.toString(intval);

Some time during this investigation I became curious if there is really a large effect between the different solutions and wrote a short test case which measured the time of different ways of converting an Integer to a String. Interestingly it did not exactly turn out as clear as I hoped. Initially it seemed that the error claim did not match the reality as my testcase did not show a clear performance advantage of any specific solution. Most ways of converting values to strings were pretty close to each other and the results varied with each execution a bit so I considered this to be mainly a measuring inaccuracy.

// Results with 100000 invocations of each method, timed using System.nanoTime()
new Integer(int).toString()      : 7075890
String.valueOf(int)              : 6570317
Integer.valueOf(int).toString()  : 9597342
Integer.toString(int)            : 11398929

With those combined results I could have been satisfied and discard the Sonar warning as just some hint to write more readable code. But something didn't feel right and after some time I had another look at this. Especially the last measurement looked quite odd. What was the difference between a static call (line 4) and a method call on a new object (line 1) so that the results are about 25% apart. And if so, why is the static call the slower one? Using the Java Decompiler Eclipse plugin I inspected the underlying source. I could have used the the source search engines available elsewhere (e.g. GrepCode) but that was just too cumbersome for a quick investigation and not safe to reflect the actual (byte-)code used. The decompiled source of java.lang.Integer made the observed difference even more mysterious.

...
public String toString()
{
  return toString(this.value);
}
...
public static String toString(int paramInt)
{
  if (paramInt == -2147483648)
    return "-2147483648";
  int i = (paramInt < 0) ? stringSize(-paramInt) + 1 : stringSize(paramInt);
  char[] arrayOfChar = new char[i];
  getChars(paramInt, i, arrayOfChar);
  return new String(arrayOfChar, true);
}
...

Ok, so the call on toString() on a new Integer object is in turn jumping to the static toString(int) method? In that case my previous measurements must have been influenced by something else than just the sourcecode itself. Most probably some effects from JIT compilation by the HotSpot JVM. And therefore the measurements were not usable until I made the effect of JIT compilation neglectable. What's the easy way of making JIT compilation a neglectable factor in measurements? Crank up the number of iterations of the code to measure. The 100k iterations in my first test ran in a mere seconds, maybe not long enough to properly warm up the JVM. But after increasing the iteration count by a factor of 1000 (taking significantly longer to run) following numbers were the result:

// Results with 100000000 invocations of each method, timed using System.nanoTime()
new Integer(int).toString()      : 6044546500
String.valueOf(int)              : 6052663051
Integer.valueOf(int).toString()  : 6287452752
Integer.toString(int)            : 5439002900

Raising the number of iterations even more did not change the relative difference between the numbers much so I think at that stage JVM and other side effects are small enough to not change the execution speed significantly. It also better fits to my expectations of the running costs when roughly brought into correlation of the execution path, logic executed and functions called of each individual conversion variant.

At that point I'm pretty confident that I now better understand the reasons for the initial Sonar warning and that the static method call is roughly 10% faster than calling toString() on newly created objects. And that's even without burdening an additional Integer object to the garbage collection which at a later stage takes some additional time to clean up.

Of course, I'm aware that this is a topic which dives deep into the micro-optimization area. On the other side it can be boiled down pretty easily to using certain methods in favour of other methods, as there is no drawback on safety or functionality and at the core the same static method will be called anyway. Furthermore, ignoring simple performance and memory optimizations may not affect you on small scale like client applications or Java applets but may be pretty relevant on larger (server) systems or even time-critical installations. For a more detailled description why this is relevant on server systems and how it affects throughput and garbage collection times see this excellent list of Java Anti-Patterns.

|

Similar entries