Language differences in unexpected places

At the moment, I am mainly using two programming languages: C# at work and Java for my hobby project (http://strategicrps.com/). Yes, I am still working on it 😉

Because of the many similarities between these two languages (at least in syntax), it is quite easy to switch from one to the other. However, it is also because of these similarities that it is sometimes easy to forget that they actually are different languages using different frameworks.

Recently, I came across one of these instances, and I thought it would be nice to share.

As an introduction, I want to ask you the following question: in the two code snippets below, what will be the values of “exists1” and “exists2”?

C#

List<int> intList = new List<int>();
Dictionary<object, int> dictionary = new Dictionary<object, int>();

dictionary.Add(intList, 7);

bool exists1 = dictionary.ContainsKey(intList);

intList.Add(3);

bool exists2 = dictionary.ContainsKey(intList);

Java

List<Integer> intList = new ArrayList<Integer>();
HashMap<Object, Integer> hashMap = new HashMap<Object, Integer>();

hashMap.put(intList, 7);

boolean exists1 = hashMap.containsKey(intList);

intList.add(3);

boolean exists2 = hashMap.containsKey(intList);

Even though the snippets are very similar (ArrayList and HashMap are the Java counterparts of List and Dictionary in C#/.NET), the answers differ. In the C# code, both booleans will be “true”, while in the Java code, “exists1” will be “true” and “exists2” will be “false”.

Both HashMap and Dictionary first use the object's hash method (hashCode() in Java and GetHashCode() in C#) to locate a candidate entry, and then use the equals method (equals(Object) in Java and Equals(object) in C#) to confirm that the key really matches. This is where the difference occurs: whereas the ArrayList in Java implements its own hash method based on its contents, the List class in C# does not. When we change the contents of the ArrayList in the example above, its hash code changes as well, so it no longer matches the hash under which the entry was stored in the HashMap, and the list cannot be found anymore. The List in C# falls back on the hash implementation of the object class, which is tied to the identity of the instance rather than to its contents, so adding an element leaves the hash code unchanged.
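To make the mechanism a bit more tangible, here is a minimal Java sketch. The class and method names (MutableKeyDemo and friends) are made up for this illustration. It uses java.util.IdentityHashMap, which compares keys by reference, as a rough stand-in for the identity-based behaviour the C# snippet gets by default; it is an analogy, not an exact equivalent of Dictionary.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical demo class, only meant to illustrate the explanation above.
public class MutableKeyDemo {

    // ArrayList.hashCode() is computed from the elements, so it changes on add().
    static boolean hashCodeChangesOnMutation() {
        List<Integer> intList = new ArrayList<>();
        int before = intList.hashCode();
        intList.add(3);
        int after = intList.hashCode();
        return before != after;
    }

    // Content-based lookup: HashMap relies on hashCode() and equals() of the key.
    static boolean foundInHashMapAfterMutation() {
        List<Integer> intList = new ArrayList<>();
        Map<Object, Integer> hashMap = new HashMap<>();
        hashMap.put(intList, 7);
        intList.add(3); // the key's hash code no longer matches the stored one
        return hashMap.containsKey(intList);
    }

    // Identity-based lookup: IdentityHashMap relies on reference equality (==)
    // and System.identityHashCode(), so the contents of the key do not matter.
    static boolean foundInIdentityMapAfterMutation() {
        List<Integer> intList = new ArrayList<>();
        Map<Object, Integer> identityMap = new IdentityHashMap<>();
        identityMap.put(intList, 7);
        intList.add(3);
        return identityMap.containsKey(intList);
    }

    public static void main(String[] args) {
        System.out.println("hash code changed:      " + hashCodeChangesOnMutation());
        System.out.println("found in HashMap:       " + foundInHashMapAfterMutation());
        System.out.println("found in identity map:  " + foundInIdentityMapAfterMutation());
    }
}
```

The identity-based map keeps finding the list after the mutation, while the regular HashMap does not, which mirrors the difference between the two original snippets.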

As I have the most experience with C#, I did not expect the hash code of the ArrayList to change when changing the contents of the list. That being said, if I think about it, the Java implementation makes more sense to me: the hash code should change with the state of an object. My experience with the way C# works simply conditioned me not to give this a second thought.

Ironically, if you have a look at the source code of the object class in C#, it has the following comment:

// GetHashCode is intended to serve as a hash function for this object.
// Based on the contents of the object, the hash function will return a suitable
// value with a relatively random distribution over the various inputs.
//
// The default implementation returns the sync block index for this instance.
// Calling it on the same object multiple times will return the same value, so
// it will technically meet the needs of a hash function, but it's less than ideal.
// Objects (& especially value classes) should override this method.

If I were to interpret this comment, I would actually expect that a class like List should have overridden the GetHashCode method, just like Java does.

Before wrapping up, I feel like I have to add that I do realize you should never use mutable objects as keys, neither in C# nor in Java. The case I came across was very different from this one, and I merely created this case as an example.
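For completeness, one possible way to sidestep the problem in Java is to store an unmodifiable copy of the list as the key, so that later changes to the original list cannot affect the stored hash code. This is only a sketch of one option, not a recommendation for every situation, and the class and method names (SnapshotKeyDemo and so on) are invented for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical demo class: uses an unmodifiable snapshot of a list as the map key.
public class SnapshotKeyDemo {

    // Copies the contents first, then wraps the copy so nobody can modify it.
    static List<Integer> snapshotOf(List<Integer> source) {
        return Collections.unmodifiableList(new ArrayList<>(source));
    }

    static boolean snapshotKeyStillFound() {
        List<Integer> intList = new ArrayList<>();
        List<Integer> key = snapshotOf(intList);

        Map<Object, Integer> hashMap = new HashMap<>();
        hashMap.put(key, 7);

        intList.add(3); // only the original list changes, the snapshot does not

        return hashMap.containsKey(key);
    }

    public static void main(String[] args) {
        System.out.println("snapshot key found: " + snapshotKeyStillFound());
    }
}
```

Because the snapshot is a separate copy, its hash code stays the same for as long as it sits in the map.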

Even though this is probably not something you will ever encounter (if you do, there is a big chance that something else is wrong with your implementation), I still thought it was an interesting subject to write about. It is not only about the underlying technical details. It is also a reminder of the things we can run into when we start to think we understand it all.

Whatever your personal take is on this subject, I hope it was at least an interesting read.