2014-02-20

Hashtable vs ConcurrentDictionary

Historically I've seen the Hashtable be favored over ConcurrentDictionary with the assumption that is was more efficient allowing for lock free reads. Well they both allow lock free reads so which one is really the better option?
First some basics. The Hashtable requires both key and value to be of type object. This means that if you use value types they will be boxed. It also means you need to cast your values to the correct type when you get them in order to actually do something useful. The ConcurrentDictionary however is a generic type so no boxing nor casting. Advantage ConcurrentDictionary!

When it comes to reading, both collections allow for lock free reading which should be very effective. Writing however is interesting. While the ConcurrentDictionary guarantees that writes are atomic the Hashtable is only thread safe if written to by a single thread at a time. This means you either need to write your code so that only one thread ever writes to a Hashtable or you need to add your own lock around it. Not having to deal with this is naturally a better option in my opinion so: Advantage ConcurrentDictionary!

Sometimes you need to enumerate all values or keys in your dictionary. The ConcurrentDictionary allows you to enumerate the collection safely but if the collection is modified during the enumeration you may or may not get the new values in your enumeration. The Hashtable however throws an exception if modified during an enumeration. The easy fix for that is to use a read/write lock for your Hashtable and acquire the read lock during enumerations and the write lock for modification. Since I don't like the undefined behavior of the ConcurrentDictionary it is: Advantage Hashtable!

Performance is always important. I ran two tests for each of the collections when writing this. First I just had five threads read data from each collection and then I inserted a large number of values while letting five threads read data at the same time. When it came to only doing reads, in my test the Hashtable was slightly faster. Relativity speaking the Hashtable allowed me to do almost 10% more reads per second than the ConcurrentDictionary. When the writes were introduced however I got some interesting result. The Hashtable could still do 10% more reads per second but it took twice the time to insert the same number of elements. Since I think a combination of reads and writes are what you expect when using these collections I have to say: Advantage ConcurrentDictionary!

So all in all I must say the ConcurrentDictionary is the winner since it means your code will be nicer and you also seem to get better performance out of it if you have several updates per second. I would only consider the Hashtable if I need to enumerate the collection a lot or if the ConcurrentDictionary turns out to have insufficient performance for the specific scenario I'm working on.

2 comments:

  1. ahem Hashtable.Synchronized()

    ReplyDelete
    Replies
    1. True, but wrapping the Hashtable still does not address important things like generics, and performance.

      Delete