Some Education While Feeding a Code Troll

It doesn’t happen often that I feel inclined to comment on something that is not part of my daily work life. Today, I’ll make an exception because a decent answer does not fit into 140 chars.

It all started out with this post on Twitter that someone in my timeline retweeted:

Note that I’m not a Java developer (but write a lot of C++), and I’m not going to defend the language. However, I think this post deliberately deceives people instead of educating them. This is what this post tries to compensate for.

To make things a bit more interesting, I’d actually like to start with the last example and work through them in reverse order:

int q = 022 - 2;
> 16 // because fuck math

This is simply a matter of notation. Any C-style language I am aware of (and Java clearly uses C traits in many ways) uses the following conventions for integer literals:

aa // 10-based
0aa // 8-based (octal)
0xaa // 16-based (hexadecimal)

where ‘aa’ represents a sequence of digits. So 022 is octal for 2 × 8 + 2 = 18, and 18 - 2 is 16. You can complain and moan about it, or simply accept that even today many people still use that kind of notation, at least the hexadecimal representation. In the case of Java, it would be simple enough to make the IDE or Linter warn about the octal notation, should studies show that it does more harm than good. Either way, this is a well-accepted idiom among developers of C-style languages, and “fuck math” is a clever deception to trick you into thinking it was a compiler bug.
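To make the notation concrete, here is a minimal sketch that puts the three literal forms side by side. The class and method names are made up for illustration; only the literals themselves come from the tweet and the list above.

```java
// Sketch: the same digits "22" read in three different bases.
final class LiteralDemo {
    static int decimal() { return 22; }      // base 10
    static int octal()   { return 022; }     // leading 0 means base 8: 2*8 + 2
    static int hex()     { return 0x22; }    // leading 0x means base 16: 2*16 + 2
    static int tweet()   { return 022 - 2; } // the expression from the tweet

    public static void main(String[] args) {
        System.out.println(decimal()); // 22
        System.out.println(octal());   // 18
        System.out.println(hex());     // 34
        System.out.println(tweet());   // 16
    }
}
```

Nothing mysterious is going on: the compiler reads 022 as 18 before any subtraction happens.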

Next, we see this:

(byte) + (char) - (int) + (long) -1;
> 1 // I'm not sure what this even means

Squeezing language features into one line in an unusual way and then complaining about it is possible in any language. The real learning value lies in the answer to two questions:

1. What are we looking at (i.e. what does it mean)?

We are looking at a number of casts. In C-style languages, a cast is a way to tell the compiler that I want to convert a value. Some casts have to be explicit, for some it’s optional. In this case, it’s pure mind-fuckery.

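You need to read this the way the compiler evaluates it: from right to left. Note that every + and - in this line is a unary sign operator, not an addition or subtraction, because each one is preceded by a cast rather than a value.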

2. Why is the result 1?

We start with -1, which would be an integer in Java. We cast it to a long integer, but the value is still -1. Effectively, the compiler is now looking at this:

(byte) + (char) - (int) + (-1);

So we simply cast this back to a normal integer:

(byte) + (char) - (-1);

This is where first-grade math kicks in and we get

(byte) + (char) + 1;

The next two casts simply convert it to char and to byte. So after two more casts, we end with this:

1;

For a value as small as 1, the data type does not matter, because every numeric type involved here can hold it. If you are interested in what kind of numbers integer, long, char and byte can hold, check Oracle’s data type reference.
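The right-to-left reading above can be sketched by giving every intermediate result its own variable. The class, method and variable names are mine, not part of the original tweet; the one-liner in original() is the tweet's expression unchanged.

```java
// Sketch: the cast chain from the tweet, unrolled one step at a time.
final class CastChainDemo {
    static int stepByStep() {
        long a = (long) -1; // -1 widened to long, value unchanged
        int b = (int) +a;   // unary plus does nothing, cast back to int: -1
        int c = -b;         // unary minus flips the sign: 1
        char d = (char) c;  // the character with code point 1
        int e = +d;         // unary plus promotes the char to int: 1
        byte f = (byte) e;  // 1 fits into a byte without loss
        return f;
    }

    static int original() {
        return (byte) + (char) - (int) + (long) -1;
    }

    public static void main(String[] args) {
        System.out.println(stepByStep()); // 1
        System.out.println(original());   // 1
    }
}
```

Written this way, the expression is six trivial conversions in a row, which is exactly why nobody writes it as one line in real code.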

Which brings us closer to the really interesting case, by looking at the second example:

Integer.valueOf(1000) == Integer.valueOf(1000)
> false // WTF?

This is caused by a decision that I cannot really comprehend: Java’s Generics only accept class types, so ints have to be wrapped in the Integer class, instead of Generics supporting non-class data types (sometimes called Plain Old Data types, or PODs), like C++ does:

std::list<int> myIntList;

Instead, in Java you have to use

List<Integer> myIntList;

Usually, that’s not a problem, as ints get transparently wrapped into Integers (appropriately called ‘boxing’), so you shouldn’t have any dealing with the Integer class outside Generics.
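As a rough illustration of that transparent wrapping, the following sketch stores plain int values in a List<Integer> and reads them back as ints. BoxingDemo and its methods are hypothetical names chosen for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: ints go into the list as Integer objects and come back out as ints.
final class BoxingDemo {
    static List<Integer> build(int first, int second) {
        List<Integer> values = new ArrayList<>();
        values.add(first);  // boxing: the compiler inserts Integer.valueOf(first)
        values.add(second); // boxing again
        return values;
    }

    static int sum(List<Integer> values) {
        int total = 0;
        for (int v : values) { // unboxing: each Integer is converted back to int
            total += v;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sum(build(1000, 6))); // 1006
    }
}
```

The Integer type only ever shows up in the declaration of the list; the rest of the code works with plain ints.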

But even if you choose to be ignorant about that, or simply don’t know better, the == operator should sound every last siren in your brain, because in Java, there is no operator overloading.

This means that whenever you compare two Objects with == (Integer.valueOf() will return an object of type Integer), it will always compare the references, that is, the addresses of the objects, rather than any of the values that they might hold. Of course, addresses of two distinct objects are not the same, even if their value is. The correct way would be to either use straight ints, or use the equals() method, which is the recommended way of comparing value type objects in Java.
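A minimal sketch of the difference, assuming nothing beyond the standard Integer class: equals() and unboxing both compare the wrapped values, while == only asks whether both sides are the very same object. The helper names are invented for this example.

```java
// Sketch: three ways to "compare" two Integer objects.
final class EqualityDemo {
    // Compares the wrapped values via equals().
    static boolean sameValue(Integer a, Integer b) {
        return a.equals(b);
    }

    // Compares the wrapped values by unboxing to int first.
    static boolean sameValueUnboxed(Integer a, Integer b) {
        return a.intValue() == b.intValue();
    }

    // Compares object identity, which is what == does on Integer operands.
    static boolean sameObject(Integer a, Integer b) {
        return a == b;
    }

    public static void main(String[] args) {
        Integer a = Integer.valueOf(1000);
        Integer b = Integer.valueOf(1000);
        System.out.println(sameValue(a, b));        // true
        System.out.println(sameValueUnboxed(a, b)); // true
        System.out.println(sameObject(a, b));       // identity, not value: see the tweet
    }
}
```

The last line is the one from the tweet; its result says something about object identity and nothing about whether 1000 equals 1000.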

So why on $deity’s earth is this happening?

Integer.valueOf(6) == Integer.valueOf(6)
> true // Of course

If you’ve paid attention so far, you will realize that the comment is again deceptive. This is the only actual WTF here, assuming you are reading this post for more than the occasional chuckle.

We just learned that Java’s == operator compares object addresses. I was briefly puzzled by this, but then theorized that the only logical explanation is that the implementation of valueOf(int) must have some kind of cache. And a look at the Java source code reveals that this guess is indeed close.

Java provides a cache for Integer objects with values from -128 to +127. This means that valueOf() returns the same object every time for an integer within this range, so comparing two such objects with == will always result in true.
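The effect of the cache can be observed with a small sketch like the one below. The class name and helper are hypothetical. The only behavior I rely on is the documented guarantee that valueOf() caches -128 to 127; what happens outside that range depends on the JVM and its settings, so the code makes no promise about it.

```java
// Sketch: probing which values Integer.valueOf() serves from its cache.
final class CacheDemo {
    // True if two valueOf() calls for the same int return the same object.
    static boolean sameObject(int value) {
        return Integer.valueOf(value) == Integer.valueOf(value);
    }

    // True if every value in the guaranteed range comes from the cache.
    static boolean guaranteedRangeIsCached() {
        for (int v = -128; v <= 127; v++) {
            if (!sameObject(v)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(sameObject(6));             // true, 6 is inside the cached range
        System.out.println(guaranteedRangeIsCached()); // true
        // Outside the range the result is not guaranteed either way;
        // a JVM with default settings typically prints false here.
        System.out.println(sameObject(1000));
    }
}
```

So both tweets are the same comparison of object identity, once with a cached object and once, typically, without.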

Again, your Linter or IDE should warn about both of the above.

Conclusion

I hope I was able to show that sometimes, there is more to seemingly weird code snippets than meets the eye. It’s usually worthwhile to lean back to try and understand some of them, and then re-evaluate your assessment on whether or not you like a given language. So again, this is not in defense of Java. Also, a tip of the hat to the author of the code for identifying these interesting language quirks.
