I'm trying to implement the Levenshtein distance (or edit distance) in Haskell, but its performance degrades rapidly as the string length increases.
This version is much quicker than the memoized versions, but I would still love to make it even quicker. It works fine with strings hundreds of characters long. It was written with other distances in mind (change the init function and the cost function) and uses the classical dynamic programming array trick. The long line could be converted into a separate function with its own 'do' block, but I like it this way.
import Data.Array.IO
import System.IO.Unsafe
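-- explicit signature: without it the monomorphism restriction leaves the Eq constraint ambiguous
editDistance :: Eq a => [a] -> [a] -> Int
editDistance = dist ini med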
dist :: (Int -> Int -> Int) -> (a -> a -> Int ) -> [a] -> [a] -> Int
dist i f a b = unsafePerformIO $ distM i f a b
-- easy to create other distances
ini i 0 = i
ini 0 j = j
ini _ _ = 0
-- a substitution costs 2 here (use 1 for the classic Levenshtein distance)
med a b = if a == b then 0 else 2
distM :: (Int -> Int -> Int) -> (a -> a -> Int) -> [a] -> [a] -> IO Int
distM ini f a b = do
    let la = length a
    let lb = length b
    arr <- newListArray ((0,0),(la,lb)) [ini i j | i<- [0..la], j<-[0..lb]] :: IO (IOArray (Int,Int) Int)
    -- all on one line
    mapM_ (\(i,j) -> readArray arr (i-1,j-1) >>= \ld -> readArray arr (i-1,j) >>= \l -> readArray arr (i,j-1) >>= \d-> writeArray arr (i,j) $ minimum [l+1,d+1, ld + (f (a !! (i-1) ) (b !! (j-1))) ] ) [(i,j)| i<-[1..la], j<-[1..lb]]
    readArray arr (la,lb)
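
One likely cost in the code above is the list indexing: a !! (i-1) and b !! (j-1) walk the list on every cell, so each table entry pays a linear lookup. Below is a minimal sketch of the same table-filling idea with the inputs copied into arrays first and the table kept in an unboxed ST array, which also avoids unsafePerformIO. The names distST, levenshtein, border and subst are made up for this example, and I have not benchmarked it, so treat any speedup as an assumption to measure.

```haskell
import Control.Monad (forM_)
import Data.Array (listArray, (!))
import Data.Array.ST (newListArray, readArray, writeArray, runSTUArray)
import Data.Array.Unboxed (UArray)
import qualified Data.Array.Unboxed as U

-- Hypothetical name; takes the same init and cost functions as `dist`.
distST :: (Int -> Int -> Int) -> (a -> a -> Int) -> [a] -> [a] -> Int
distST ini f a b = table U.! (la, lb)
  where
    la = length a
    lb = length b
    -- Boxed arrays give O(1) access to the elements of both inputs.
    xs = listArray (1, la) a
    ys = listArray (1, lb) b
    table :: UArray (Int, Int) Int
    table = runSTUArray $ do
      -- Start from the initial values, as in the IOArray version.
      arr <- newListArray ((0, 0), (la, lb))
               [ini i j | i <- [0 .. la], j <- [0 .. lb]]
      forM_ [1 .. la] $ \i ->
        forM_ [1 .. lb] $ \j -> do
          ld <- readArray arr (i - 1, j - 1)
          l  <- readArray arr (i - 1, j)
          d  <- readArray arr (i, j - 1)
          writeArray arr (i, j) $
            minimum [l + 1, d + 1, ld + f (xs ! i) (ys ! j)]
      return arr

-- Classic Levenshtein distance: every edit, including substitution, costs 1.
levenshtein :: Eq a => [a] -> [a] -> Int
levenshtein = distST border subst
  where
    border i 0 = i
    border 0 j = j
    border _ _ = 0
    subst x y = if x == y then 0 else 1
```

For example, levenshtein "kitten" "sitting" evaluates to 3. Passing the original ini and med to distST instead should give the same results as dist, since only the storage changes and not the recurrence.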