Does a useful Haskell HashMap/HashTable/Dictionary library exist?

Submitted by 送分小仙女 on 2019-12-09 02:51:51

Question


I'm looking for a monad-free associative array with constant-time, O(1), lookups.

Consider the hypothetical type:

data HT k v = ???

I want to construct an immutable structure once:

fromList :: (Foldable t, Hashable k) => t (k,v) -> HT k v

I want to subsequently query it repeatedly with constant-time access:

lookup :: Hashable k => HT k v -> k -> Maybe v

There appear to be two candidate libraries, both of which fall short:

  • unordered-containers

  • hashtables

unordered-containers

unordered-containers contains both strict and lazy variants of the type HashMap. Both HashMaps have O(log n) lookups, as documented for the lookup function. This lookup cost appears to stem from the construction of the HashMap types, which use an internal tree structure that allows O(log n) inserts. That is an understandable design trade-off for many use cases, but since I don't need a mutable HashMap, it hampers mine.

hashtables

hashtables contains a HashTable type class and three instance types with varying table construction strategies. The type class defines a constant-time O(1) lookup function, but it is permanently embedded in the ST monad. There is no way to "freeze" the stateful HashTable implementations and obtain a lookup function that is not embedded in a stateful monad. The type-class interface is well designed when the entire computation is wrapped in a state monad, but this design is unsuitable for my use case.


Does there exist some other library which defines types and functions which can construct an immutable constant access query O(1) associative array that is not embedded in a stateful monad?

Does there exist some way to wrap or modify these existing hashing-based libraries to produce an immutable constant access query O(1) associative array that is not embedded in a stateful monad?


Answer 1:


The library you want is… unordered-containers. Or just plain old Data.Map from containers, if you’d prefer.

The note in the unordered-containers documentation explains why you shouldn’t worry about the O(log n) time complexity for lookups:

Many operations have a average-case complexity of O(log n). The implementation uses a large base (i.e. 16) so in practice these operations are constant time.

This is a common practice with certain kinds of functional data structures because it allows good sharing properties while also having good time complexities. log16 still produces very small numbers even for very large n, so you can almost always treat those complexities as “effectively constant time”.

If this is ever a bottleneck for your application, sure, go with something else, but I find that highly unlikely. After all, log16(1,000,000) is a little under 5, so your lookup time is not going to grow very quickly. Processing all that data is going to take up much more time than the overhead of the lookup.
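To put numbers on that, here is a small sketch of the arithmetic. It assumes the branching factor of 16 from the documentation note quoted above, and compares it with a binary tree. The names levelsBase16, levelsBase2, sizes and table are made up for illustration and are not part of any library.

```haskell
-- Rough upper bound on the number of tree levels a lookup visits,
-- assuming a branching factor of 16 (taken from the quoted docs note).
-- Hypothetical helper, not a library function.
levelsBase16 :: Double -> Int
levelsBase16 n = ceiling (logBase 16 n)

-- The same bound for a binary tree, for comparison.
levelsBase2 :: Double -> Int
levelsBase2 n = ceiling (logBase 2 n)

-- Element counts to compare (arbitrary illustrative sizes).
sizes :: [Double]
sizes = [1e3, 1e6, 1e9]

-- (n, base-16 levels, base-2 levels) for each size.
table :: [(Double, Int, Int)]
table = [ (n, levelsBase16 n, levelsBase2 n) | n <- sizes ]
```

Under these assumptions, a thousand, a million and a billion elements come out to 3, 5 and 8 levels for base 16, against 10, 20 and 30 for base 2.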

As always: profile first. If you somehow have a problem that absolutely needs the fastest possible hash map in the world, you might need an imperative hash map, but for every case I’ve ever had, the functional ones work just fine.
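For the interface sketched in the question, a thin wrapper over Data.HashMap.Strict is probably all that is needed. The following is a minimal sketch, assuming the unordered-containers and hashable packages are installed. The HT type, the argument order of lookup, and the ports example data are taken from (or invented to match) the question, not from the library.

```haskell
import qualified Data.HashMap.Strict as HM
import Data.Hashable (Hashable)
import Data.Foldable (toList)
import Prelude hiding (lookup)

-- The question's hypothetical type, backed by a persistent HashMap.
newtype HT k v = HT (HM.HashMap k v)

-- Build the structure once from any Foldable of key/value pairs.
fromList :: (Foldable t, Eq k, Hashable k) => t (k, v) -> HT k v
fromList = HT . HM.fromList . toList

-- Pure lookup with the question's argument order (table first, key second).
lookup :: (Eq k, Hashable k) => HT k v -> k -> Maybe v
lookup (HT m) k = HM.lookup k m

-- Made-up example data.
ports :: HT String Int
ports = fromList [("http", 80), ("https", 443), ("ssh", 22)]
```

With this in scope, lookup ports "https" is an ordinary pure expression and needs no monad.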




Answer 2:


You should follow Alexis' suggestion and use unordered-containers. If you really want something that is guaranteed to have Θ(1) lookups, you can define your own frozen version of any of the hash table types from hashtables using unsafePerformIO, but this is not very elegant. For example:

module HT
    ( HT
    , fromList
    , lookup
    ) where

import qualified Data.HashTable.IO as H
import Data.Hashable (Hashable)
import Data.Foldable (toList)
import System.IO.Unsafe (unsafePerformIO)
import Prelude hiding (lookup)

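-- Abstract wrapper: the constructor is not exported, so callers cannot mutate the table.
newtype HT k v = HT (H.BasicHashTable k v)

-- Build the table once in IO, then treat it as frozen.
fromList :: (Foldable t, Eq k, Hashable k) => t (k, v) -> HT k v
fromList = HT . unsafePerformIO . H.fromList . toList

-- Reading a table that is never mutated again behaves like a pure function.
lookup :: (Eq k, Hashable k) => HT k v -> k -> Maybe v
lookup (HT h) k = unsafePerformIO $ H.lookup h k

Both uses of unsafePerformIO above should be safe. For that, it is crucial that HT is exported as an abstract type.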




Answer 3:


Does there exist some other library which defines types and functions which can construct an immutable constant access query O(1) associative array that is not embedded in a stateful monad?

At this point in time, the answer is still no.

As of late 2019, there is an efficient IO-based hashtable package with decent benchmarks.

What you describe seems doable in the same way that pure, immutable Data.Array construction is possible. See Data.Array.Base for how this is achieved via unsafe* operators. A Data.Array is defined with a bound, and my initial thought is that a pure, immutable hashtable will potentially have GC problems if it's allowed to grow without bounds.
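As a rough illustration of the Data.Array idea, here is a minimal sketch of a fixed-size, immutable bucket table built with accumArray. It is not an existing library, and every name in it (HT, fromListWith, lookupHT, toyHash, demo) is hypothetical. To depend only on the array package, the hash function is passed in explicitly; with the hashable package you could pass hash instead. Lookups are only expected constant time if the hash function spreads keys evenly across buckets.

```haskell
import Data.Array (Array, accumArray, (!))
import qualified Data.List as L

-- A frozen table: bucket count, caller-supplied hash function,
-- and an immutable array of collision buckets.
data HT k v = HT Int (k -> Int) (Array Int [(k, v)])

-- Build the table once. The bucket count is fixed at construction time,
-- so the structure is bounded in the same way a Data.Array is.
fromListWith :: (k -> Int) -> [(k, v)] -> HT k v
fromListWith h kvs = HT n h buckets
  where
    n       = max 1 (length kvs)
    -- Later pairs are consed onto the front of their bucket,
    -- so the last binding for a duplicate key wins.
    buckets = accumArray (flip (:)) [] (0, n - 1)
                [ (h k `mod` n, (k, v)) | (k, v) <- kvs ]

-- Pure lookup: one array index, then a scan of a (hopefully short) bucket.
lookupHT :: Eq k => HT k v -> k -> Maybe v
lookupHT (HT n h buckets) k = L.lookup k (buckets ! (h k `mod` n))

-- Deliberately naive hash for demonstration only.
toyHash :: String -> Int
toyHash = sum . map fromEnum

-- Made-up example data with a duplicate key.
demo :: HT String Int
demo = fromListWith toyHash [("a", 1), ("b", 2), ("a", 3)]
```

Because the table is never resized after construction, it cannot grow without bounds, which sidesteps the GC concern mentioned above at the cost of needing all entries up front.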



Source: https://stackoverflow.com/questions/39238989/does-a-useful-haskell-hashmap-hashtable-dictionary-library-exist
