I'm trying to implement the Levenshtein distance (or edit distance) in Haskell, but its performance degrades rapidly as the string length increases.
I'm still q
This version is much quicker than those memoized versions, but I would still love to make it even quicker. It works fine with strings that are hundreds of characters long. It was written with other distances in mind (change the init function and the cost function), and it uses the classical dynamic-programming array trick. The long line could be converted into a separate function with a top-level 'do', but I like it this way.
import Data.Array.IO
import System.IO.Unsafe
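-- explicit signature, so the monomorphism restriction does not leave 'a' ambiguous
editDistance :: Eq a => [a] -> [a] -> Int
editDistance = dist ini med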
dist :: (Int -> Int -> Int) -> (a -> a -> Int ) -> [a] -> [a] -> Int
dist i f a b = unsafePerformIO $ distM i f a b
-- easy to create other distances
ini i 0 = i
ini 0 j = j
ini _ _ = 0
med a b = if a == b then 0 else 2
distM :: (Int -> Int -> Int) -> (a -> a -> Int) -> [a] -> [a] -> IO Int
distM ini f a b = do
    let la = length a
    let lb = length b
    arr <- newListArray ((0,0),(la,lb)) [ini i j | i<- [0..la], j<-[0..lb]] :: IO (IOArray (Int,Int) Int)
    -- all on one line
    mapM_ (\(i,j) -> readArray arr (i-1,j-1) >>= \ld -> readArray arr (i-1,j) >>= \l -> readArray arr (i,j-1) >>= \d-> writeArray arr (i,j) $ minimum [l+1,d+1, ld + (f (a !! (i-1) ) (b !! (j-1))) ] ) [(i,j)| i<-[1..la], j<-[1..lb]]
    readArray arr (la,lb)
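One likely remaining cost in the code above is the list indexing with (!!), which walks the list on every cell of the table. Below is a rough sketch of the same recurrence with both inputs copied into arrays first, and with the table built in ST instead of IO so that unsafePerformIO is not needed. The names distFast and levenshtein are made up for this example, and the substitution cost of 1 in levenshtein is an assumption (med above uses 2). I have not benchmarked it, so treat the speed-up as plausible rather than measured.

```haskell
import Data.Array (Array, listArray, (!))
import Data.Array.ST (newListArray, readArray, writeArray, runSTUArray)
import qualified Data.Array.Unboxed as U
import Control.Monad (forM_)

-- Hypothetical variant of dist/distM: same init and cost parameters,
-- but the inputs are indexed through arrays instead of (!!).
distFast :: (Int -> Int -> Int) -> (a -> a -> Int) -> [a] -> [a] -> Int
distFast ini f a b = table U.! (la, lb)
  where
    la = length a
    lb = length b
    -- 1-based copies of the inputs, giving O(1) element access
    xs = listArray (1, la) a
    ys = listArray (1, lb) b
    table :: U.UArray (Int, Int) Int
    table = runSTUArray $ do
      -- fill the whole table from the init function, as in distM
      arr <- newListArray ((0, 0), (la, lb))
               [ini i j | i <- [0 .. la], j <- [0 .. lb]]
      forM_ [1 .. la] $ \i ->
        forM_ [1 .. lb] $ \j -> do
          diag <- readArray arr (i - 1, j - 1)
          up   <- readArray arr (i - 1, j)
          left <- readArray arr (i, j - 1)
          writeArray arr (i, j) $
            minimum [up + 1, left + 1, diag + f (xs ! i) (ys ! j)]
      return arr

-- Assumed costs for plain Levenshtein: insert, delete and substitute all cost 1.
levenshtein :: Eq a => [a] -> [a] -> Int
levenshtein = distFast border subst
  where
    border i 0 = i
    border 0 j = j
    border _ _ = 0
    subst x y = if x == y then 0 else 1
```

With these costs, levenshtein "kitten" "sitting" gives 3. Passing a cost function that returns 2 for a mismatch, like med above, gives 5 for the same pair. The table still takes O(la * lb) memory; keeping only two rows would reduce that, at the price of slightly more bookkeeping.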