January 12, 2014

Small blog update - now with safe social network integration

A few days ago I have been told again that my blog lacks the 'Like' button. Since that has not been the first time that I received this request and I had a few spare hours yesterday I decided to give it a go.

But I didn't want to include the social networking buttons of the most common networks without further care. From other webpages I know that some 'Like' buttons add significantly to the load and display time of websites if they are not very multimedia intense. I consider my blog to exactly fall into that category and therefore I want to avoid to double page loading times just for adding some tiny icons. Furthermore, in the mist of the Snowden/NSA revealations, I do not want to the visits of my postings to be automatically tracked by a multitude of different companies all over the internet.

Granted, I use Google Analytics for tracking the visits to my blog and individual posts to find out which area draws the most interest and how much traffic in general is receiving at my blog, but I set the "anonymizeIp"-parameter in my tracking to disallow the storage of detailled visitor IP addresses for Analytics processing. Yeah, I know it's not 100% anonymous and you still have to trust Google to respect this setting independend of their promises, but for me that's the acceptable balance between cost and benefit.

Back to the social network integration. To value the visitors experience and proactively counter the NSAs tracking abilities I decided to use a 2-staged approach in my blog. This means to Like/Tweet/+1 one of my postings or see the number of tweets/+1's (doesn't currently work for FB) one has to "activate" the specific button with a single click in advance. If this activation is not performed, no data or request is sent to the respecting server/company.

In my blog I use the solution of the Heise publisher, 2 Klicks für mehr Datenschutz, which they provide free for usage on their project page. It took me some time to integrate it on my blog, mainly because the sourcecode which they provide is not compatible to recent versions of jQuery and already a bit out of sync with integration changes by the social network buttons. The version used on http://heise.de is more up to date but not yet reflected on its project pageI wrote a notice about that to the writers of the plugin but have only received an automatic reply so far. We'll see...

Nevertheless, the integration on my blog is finished so far and some other Javascript code has also received a small overhaul. The loading time of my pages shouldn't be too much affected, there is only a small visual inconsistency left. If you don't notice it, don't bother. Maybe I'll manage to fix it, otherwise not much harm is done, at least in my opinion.

January 9, 2014

Java Tip #12: Use Collection min()/max() when looking for lowest/highest item in collections

At work I regularly stumble across a specific type of processing: sorting a collection/list and afterwards only retrieving the first or last element. Such code constructs are clearly for fetching the minimum or maximum element but sorting the whole collection for just a single item seems to be a bit of overhead. And indeed it is and the Collection class provides methods for these purposes without having to juggle items in memory and creating new collection objects which are never really required.


Advice

If you have a collection and want to retrieve the minimum or maximum element from it you don't have to sort it and retrieve the first or last but java.util.Collections offers its min() and max() methods for that which are much more efficient for that exact purpose. These can also be used with custom Comparators.

Code-Example

Before

// retrieve smallest element
Collections.sort(elementList);
Element smallest = elementList.get(0);    
...
// fetch item with highest custom value
Collections.sort(customItemsList, Collections.reverseOrder(new MyCustomComparator()));
Item largestItem = customItemsList.get(0);

After

Element smallest = Collections.min(elementList);    
...    
Item largestItem = Collections.max(customItemsList, new MyCustomComparator());

Benefit

Readability gain. Possible huge performance gain, as min()/max() run in O(n) while sorting takes O(n log(n)).

Remarks

If the comparison methods (compare(), equals()) do not fulfill the Comparator requirements according to the Java SDK (see here) (i.e. A.equals(B) != B.equals(A) or (A<B)!=(B>A), etc.) then the result of min()/max() may be different than the result from the approach using sorted collections.

January 3, 2014

Measure your assumptions, and then measure again

Happy New Year! I planned to have a blog post finished before 2013 ended but this didn't work out. Hopefully this one compensates that a bit.

A little while ago I began preparing another post in my series of Java Tips. Well, this posting is still due but for now I can present you another one where I give a short look on some investigation the preparation of this Java tip required.

It originated in a couple of Sonar warnings on one of our development systems on my job. Specifically the warning about the inefficient way of converting a number to String spiked my interest and I began evaluating the places where these warnings showed up. The warning itself told that there were conversions of numbers to Strings in the form of

String s = new Integer(intval).toString();

which should be better expressed from a performance point of view as

String s = Integer.toString(intval);

Some time during this investigation I became curious if there is really a large effect between the different solutions and wrote a short test case which measured the time of different ways of converting an Integer to a String. Interestingly it did not exactly turn out as clear as I hoped. Initially it seemed that the error claim did not match the reality as my testcase did not show a clear performance advantage of any specific solution. Most ways of converting values to strings were pretty close to each other and the results varied with each execution a bit so I considered this to be mainly a measuring inaccuracy.

// Results with 100000 invocations of each method, timed using System.nanoTime()
new Integer(int).toString()      : 7075890
String.valueOf(int)              : 6570317
Integer.valueOf(int).toString()  : 9597342
Integer.toString(int)            : 11398929

With those combined results I could have been satisfied and discard the Sonar warning as just some hint to write more readable code. But something didn't feel right and after some time I had another look at this. Especially the last measurement looked quite odd. What was the difference between a static call (line 4) and a method call on a new object (line 1) so that the results are about 25% apart. And if so, why is the static call the slower one? Using the Java Decompiler Eclipse plugin I inspected the underlying source. I could have used the the source search engines available elsewhere (e.g. GrepCode) but that was just too cumbersome for a quick investigation and not safe to reflect the actual (byte-)code used. The decompiled source of java.lang.Integer made the observed difference even more mysterious.

...
public String toString()
{
  return toString(this.value);
}
...
public static String toString(int paramInt)
{
  if (paramInt == -2147483648)
    return "-2147483648";
  int i = (paramInt < 0) ? stringSize(-paramInt) + 1 : stringSize(paramInt);
  char[] arrayOfChar = new char[i];
  getChars(paramInt, i, arrayOfChar);
  return new String(arrayOfChar, true);
}
...

Ok, so the call on toString() on a new Integer object is in turn jumping to the static toString(int) method? In that case my previous measurements must have been influenced by something else than just the sourcecode itself. Most probably some effects from JIT compilation by the HotSpot JVM. And therefore the measurements were not usable until I made the effect of JIT compilation neglectable. What's the easy way of making JIT compilation a neglectable factor in measurements? Crank up the number of iterations of the code to measure. The 100k iterations in my first test ran in a mere seconds, maybe not long enough to properly warm up the JVM. But after increasing the iteration count by a factor of 1000 (taking significantly longer to run) following numbers were the result:

// Results with 100000000 invocations of each method, timed using System.nanoTime()
new Integer(int).toString()      : 6044546500
String.valueOf(int)              : 6052663051
Integer.valueOf(int).toString()  : 6287452752
Integer.toString(int)            : 5439002900

Raising the number of iterations even more did not change the relative difference between the numbers much so I think at that stage JVM and other side effects are small enough to not change the execution speed significantly. It also better fits to my expectations of the running costs when roughly brought into correlation of the execution path, logic executed and functions called of each individual conversion variant.

At that point I'm pretty confident that I now better understand the reasons for the initial Sonar warning and that the static method call is roughly 10% faster than calling toString() on newly created objects. And that's even without burdening an additional Integer object to the garbage collection which at a later stage takes some additional time to clean up.

Of course, I'm aware that this is a topic which dives deep into the micro-optimization area. On the other side it can be boiled down pretty easily to using certain methods in favour of other methods, as there is no drawback on safety or functionality and at the core the same static method will be called anyway. Furthermore, ignoring simple performance and memory optimizations may not affect you on small scale like client applications or Java applets but may be pretty relevant on larger (server) systems or even time-critical installations. For a more detailled description why this is relevant on server systems and how it affects throughput and garbage collection times see this excellent list of Java Anti-Patterns.