I need a fast replacement for the System.Collections.Generic.Dictionary. My application should be really fast. So, the repl
Chances are you're seeing JIT compilation. On my box, I see:
00:00:00.0000360
00:00:00.0000060
when I run it twice in quick succession within the same process - and not in the debugger. (Make sure you're not running it in the debugger, or it's a pointless test.)
Now, measuring any time that tiny is generally a bad idea. You'd need to iterate millions of times to get a better idea of how long it's taking.
Do you have good reason to believe it's actually slowing down your code - or are you basing it all on your original timing?
I doubt that you'll find anything significantly faster than Dictionary<TKey, TValue>, and I'd be very surprised to find that it's the bottleneck.
EDIT: I've just benchmarked adding a million elements to a Dictionary<TKey, TValue> where all the keys were existing objects (strings in an array), reusing the same value (as it's irrelevant) and specifying a capacity of a million on construction - and it took about 0.15s on my two-year-old laptop.
Is that really likely to be a bottleneck for you, given that you've already said you're using some "old slow libraries" elsewhere in your app? Bear in mind that the slower those other libraries are, the less impact an improved collection class will have. If the dictionary changes are only accounting for 1% of your overall application time, then even if we could provide an instantaneous dictionary, you'd only speed up your app by 1%.
As ever, get a profiler - it'll give you a much better idea of where your time is going.
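For anyone who wants to repeat the measurement described in the edit above, here is a minimal sketch. It is not the original benchmark code: the class name, the method names and the "key" + i key format are my own assumptions, and the timings will differ from machine to machine. It runs a small warm-up pass first so that JIT compilation is not part of the measured time.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

public static class DictionaryAddBenchmark
{
    // Adds `count` pre-built string keys to a pre-sized dictionary and
    // returns the dictionary together with the time spent adding.
    public static (Dictionary<string, string> Map, TimeSpan Elapsed) Run(int count)
    {
        // Build the keys up front so string allocation is not timed.
        var keys = new string[count];
        for (int i = 0; i < count; i++)
        {
            keys[i] = "key" + i;
        }

        const string value = "value"; // the same value is reused; it is irrelevant here

        var stopwatch = Stopwatch.StartNew();
        var map = new Dictionary<string, string>(count); // capacity specified on construction
        foreach (string key in keys)
        {
            map.Add(key, value);
        }
        stopwatch.Stop();

        return (map, stopwatch.Elapsed);
    }

    // Warms up with a small run, then times a million adds.
    public static TimeSpan MeasureMillionAdds()
    {
        Run(1000); // warm-up pass so the JIT has already compiled Run
        return Run(1_000_000).Elapsed;
    }
}
```

Calling Console.WriteLine(DictionaryAddBenchmark.MeasureMillionAdds()) from a release build, outside the debugger, prints the elapsed time. Treat a single run as a rough indication only.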
USE INTS AS KEYS FOR MAXIMUM PERFORMANCE:
For anyone who came here from Google: if you want to squeeze every last bit of performance out of a Dictionary, use ints as keys. Here's a benchmark comparing int vs string keys: https://jacksondunstan.com/articles/2527
The author of the article even mentions that converting strings to ints is worthwhile if you have such a need.
Also, note that the same behavior occurs in some other languages, such as PHP. PHP associative arrays are in fact dictionaries, and if you use ints in ascending order in PHP 7, they outperform string keys tremendously.
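One way to apply the "convert strings to ints" idea is to assign each distinct string an int id once, and then use that id as the key in the hot path. The sketch below is only an illustration: the KeyIds class and its GetId method are hypothetical names of mine, not something from the linked article.

```csharp
using System;
using System.Collections.Generic;

public static class KeyIds
{
    // Maps each distinct string to a small int id, assigned in order of first use.
    private static readonly Dictionary<string, int> Ids =
        new Dictionary<string, int>(StringComparer.Ordinal);

    // Returns the id for `name`, creating a new one on first use.
    // Not thread-safe; meant to be called outside the hot path.
    public static int GetId(string name)
    {
        if (!Ids.TryGetValue(name, out int id))
        {
            id = Ids.Count;
            Ids.Add(name, id);
        }
        return id;
    }
}
```

You would resolve the id once (for example int titleId = KeyIds.GetId("Title");) and then index a Dictionary<int, TValue> with titleId in the performance-critical code. Whether this pays off depends on how often the same keys are reused, so measure it.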
In addition to everything said above, please note the following as well:
Dictionary<string, string> dictionary = new Dictionary<string, string>( 301 );
Depending on whether you need add or get to be faster, you may want to focus the optimization on Add/Remove or on retrieval only. Sometimes it matters more to locate and retrieve items quickly than to add or remove them. Your example uses the dictionary.Add method, but the question also asks for a faster replacement for the whole Dictionary<TKey, TValue> class, so I assume you want the get path to be faster as well as add. In that case, the next point may be a faster solution for specific patterns of key data.
The only thing faster than Dictionary and SortedList(int) is a plain array, statically or dynamically allocated (string[] in C#), but that is a Big O trade-off between time and space.
Explanation:
a.1) Dictionary can get values in O(1), provided there are not many hash collisions.
a.2) Dictionary add is usually O(1) but sometimes O(n). If you add one item after another, then whenever the item count reaches the current capacity (which is a prime number), the internal storage is resized and that single add costs O(n) instead of O(1). Source: Understanding Generic Dictionary in-depth
b.1) An array element is accessed simply by an int index into a pre-allocated memory segment: Array[Index] (time complexity O(1)). That involves less work than the equivalent dictionary lookup, which is roughly LoopSearchInEntryListTargetElement(TransformToBucketArrayIndex(GetHashCode())). The entry list may be iterated for anywhere from 1 to 100 cycles in case of collisions.
b.2) Setting a value in an array is likewise just an assignment at an int index in memory (time complexity O(1)). With a Dictionary, this sometimes requires a resize and/or reorganization.
In your case: if you know that the number of distinct key strings does not exceed uint.MaxValue (unsigned 32-bit integer, in a 32-bit environment), and that no key is longer than 4 characters (assuming the character set runs from char(0) to char(255)), you could transform any such string into a corresponding int value, used as an index into our string array, to write or read a string value as fast as possible.
That would always be O(1) time complexity for both getting and assigning a value in the array. (Contains(TKey) could be written as TKeyValueArray[index] != null.
Note: if TValue values can themselves be null in your scenario, create a custom class or generic struct similar to KeyValuePair but with an additional boolean field flagging Set or NotSet.)
Rough example (hint): take the byte code of each char at string indexes [0, 1, 2, 3] and do some simple math:
index = SomeKeyString[0] * 256 * 256 * 256
      + SomeKeyString[1] * 256 * 256
      + SomeKeyString[2] * 256
      + SomeKeyString[3]
The formula and approach can be optimized per case (if the strings contain only Latin-1 alphabet characters, you need less memory, or you can represent longer TKey strings in your array).
This is for cases of desperate need for performance.
*The Latin-1 alphabet uses 191 characters: ISO 8859-1 encodes what it refers to as "Latin alphabet no. 1", consisting of 191 characters from the Latin script...*
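To make the index arithmetic described above concrete, here is a minimal sketch of a direct-indexed table. It deliberately restricts keys to at most 2 characters in the char(0) to char(255) range, so the backing array has only 256 * 256 slots; a full 4-character version would need on the order of 2^32 slots, which is rarely practical. The TwoCharKeyTable class and its members are hypothetical names of mine, and keys shorter than 2 characters are treated as if padded with char(0), so "A" and "A\0" would share a slot.

```csharp
using System;

public sealed class TwoCharKeyTable
{
    // One slot per possible key: 256 * 256 combinations of two characters.
    private readonly string[] _values = new string[256 * 256];

    // Same arithmetic as the formula above, reduced to two characters:
    // index = key[0] * 256 + key[1], with missing characters counted as 0.
    private static int ToIndex(string key)
    {
        if (key == null) throw new ArgumentNullException(nameof(key));
        if (key.Length > 2)
            throw new ArgumentException("Key must be at most 2 characters.", nameof(key));

        int index = 0;
        for (int i = 0; i < 2; i++)
        {
            int code = i < key.Length ? key[i] : 0;
            if (code > 255)
                throw new ArgumentException("Key characters must be in the range 0..255.", nameof(key));
            index = index * 256 + code;
        }
        return index;
    }

    // O(1) write: a single array assignment.
    public void Set(string key, string value) => _values[ToIndex(key)] = value;

    // O(1) read: returns null when nothing was stored for the key.
    public string Get(string key) => _values[ToIndex(key)];

    // Works only because stored values are assumed to be non-null.
    public bool Contains(string key) => _values[ToIndex(key)] != null;
}
```

The table trades memory for speed: every instance allocates all 65,536 slots up front, whether or not they are used.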
Sorry for providing only loosely explained hints; I will try to provide a more detailed answer if there is interest.
Also, please read this: Initial capacity of collection types, e.g. Dictionary, List
If you really need better performance, you're going to have to give up something major - like generics, dynamic memory allocation, etc. All those features sacrifice some performance.
I would avoid using Contains if at all possible and look at TryGetValue etc.
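To illustrate the difference, here is a small sketch comparing the two lookup styles. The class and method names are mine, purely for illustration. The first version looks the key up twice (once in ContainsKey, once in the indexer); the second does it once.

```csharp
using System.Collections.Generic;

public static class LookupExamples
{
    // Two lookups: ContainsKey finds the entry, then the indexer finds it again.
    public static string GetWithContains(Dictionary<string, string> map, string key)
    {
        if (map.ContainsKey(key))
        {
            return map[key];
        }
        return null;
    }

    // One lookup: TryGetValue reports presence and returns the value together.
    public static string GetWithTryGetValue(Dictionary<string, string> map, string key)
    {
        return map.TryGetValue(key, out string value) ? value : null;
    }
}
```

Both methods return the same result; the second simply avoids hashing and comparing the key a second time.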
Dictionaries let you specify an IEqualityComparer. For strings, or other types, the default generic comparer may not be the best-performing one. A little ILSpy will show you what the default comparer does; if your implementation suffers performance-wise, you can inject your own IEqualityComparer. In the end, the dictionary compares the hash code of the key you provide with the existing hash codes in its list of entries.
So if you have specific needs for your dictionary, you could perhaps specialize it in a FastDictionary class that gets to the hash code in a more efficient way.
In your implementation that would be:
var dictionary = new Dictionary<string, string>(StringComparer.Ordinal);
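If the built-in comparers do not fit, injecting your own looks roughly like the sketch below. The Fnv1aOrdinalComparer class is a hypothetical example of mine that pairs ordinal equality with an FNV-1a hash; I make no claim that it is faster than StringComparer.Ordinal, so benchmark it against your own keys before adopting anything like it.

```csharp
using System;
using System.Collections.Generic;

public sealed class Fnv1aOrdinalComparer : IEqualityComparer<string>
{
    // Ordinal (char-by-char) equality, consistent with the hash below.
    public bool Equals(string x, string y) =>
        string.Equals(x, y, StringComparison.Ordinal);

    // 32-bit FNV-1a over the UTF-16 code units of the string.
    // Assumes s is non-null, which holds for dictionary keys.
    public int GetHashCode(string s)
    {
        unchecked
        {
            uint hash = 2166136261;
            foreach (char c in s)
            {
                hash ^= c;
                hash *= 16777619;
            }
            return (int)hash;
        }
    }
}
```

It is passed in the same way as the built-in comparer: new Dictionary<string, string>(new Fnv1aOrdinalComparer()).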
Could you use a List and define an enum such that, for example, fieldName = 0, Title = 1, and use each property's unique index as a lookup index into the list? That would be the fastest solution, though the least flexible, since you'd be tied to an enum.
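A minimal sketch of that idea, assuming the set of fields is fixed at compile time. The Field enum and FieldValues class are illustrative names, and I use an array rather than a List since the size never changes.

```csharp
using System;

// Each field gets a fixed index, as suggested above.
public enum Field
{
    FieldName = 0,
    Title = 1
}

public sealed class FieldValues
{
    // One slot per enum member; assumes the enum values are 0..N-1 with no gaps.
    private readonly string[] _values = new string[Enum.GetValues(typeof(Field)).Length];

    // The lookup is a cast plus an array access: no hashing, no comparisons.
    public string this[Field field]
    {
        get => _values[(int)field];
        set => _values[(int)field] = value;
    }
}
```

Usage would look like values[Field.Title] = "Dune"; followed by reading values[Field.Title]. Adding a new field means adding an enum member and recompiling, which is the inflexibility mentioned above.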