Huffman Tree decoding

Posted by 我怕爱的太早我们不能终老 on 2020-01-26 04:58:26

Question


Given a Huffman tree and a stream of bits, return a pair containing (1) the string of symbols encoded by the bits (according to the Huffman tree), and (2) a Bool indicating whether the output stream contains every bit from the input (that is, return False if there were any bits left over).

Here is the code. It only returns the first symbol in the tree. What's the problem?

data BTree a = Leaf a | Fork (BTree a) (BTree a) deriving (Show, Eq)

traT :: BTree a -> BTree a -> [Bool] -> [a] -> ([a], Bool)
traT (Leaf v) c bs res= (res++[v], True)
traT (Fork left right) c (b:bs) res
   | b         = traT right c bs res
   | otherwise = traT left c bs res
traT _ c [] res = (res, True)
traT _ c bs res = traT c c (bs) res
traT _ c bs res = (res, False)

decode :: BTree a -> [Bool] -> ([a], Bool)
decode (Fork x y) bs = traT (Fork x y) (Fork x y) bs []
decode (Leaf x) bs = traT(Leaf x) (Leaf x) bs []

Answer 1:


Well, you seem to be on the right track.

"it only returns the first symbol in the tree."

Your main problem is with these 2 lines:

traT (Leaf v) c bs res= (res++[v], True)
...
traT _ c bs res = traT c c (bs) res

Haskell tries the clauses of a function from top to bottom, so the first line masks the second one for all leaf nodes. Yet the second line is your only recursive call that could restart from the top of the tree once a leaf is reached, hence your only hope of processing any further bits.
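The masking effect can be reproduced in isolation. The sketch below uses two hypothetical toy functions (masked and reachable, neither of which is part of the question's code) to show that an earlier clause that matches everything keeps a later clause from ever running:

```haskell
-- Hypothetical toy functions, only meant to illustrate clause ordering.

-- The first clause matches every argument, so the second clause is
-- unreachable. GHC usually reports it as a redundant pattern match.
masked :: Int -> String
masked _ = "first clause"
masked 0 = "second clause"

-- With the specific clause first, both clauses can be selected.
reachable :: Int -> String
reachable 0 = "specific clause"
reachable _ = "general clause"
```

Loaded in GHCi, masked 0 yields "first clause", whereas reachable 0 yields "specific clause". In traT, the (Leaf v) clause plays the role of the first clause of masked.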

A couple of remarks:

  1. The res++[v] expression forces the code to rescan the whole symbol list each time a symbol is added.
  2. The second line would call itself endlessly (but it is masked by the first one).

Another (smaller) problem is that returning just one flag for the presence of "extra" bits at the end of the bit stream loses information, as we would like to know what the extra bits are. Collapsing them into a flag inside your core recursive function is a bit risky. Of course, it is perfectly OK to do it in the final outer decode function.

This is why in the code sample below, I have used an extra symBits argument to keep the bits that have been processed but not yet attributed to a symbol. I keep them in reverse order, because Haskell prefers to prepend items to a list, rather than to put them at the end, rescanning the whole list to do so. Hence the call to reverse in the final stage of processing. It is a cheap reverse call, as it is limited in length to the depth of our Huffman tree.
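To make the cost argument concrete, here is a minimal sketch comparing the two ways of accumulating items. The names collectAppend and collectPrepend are hypothetical helpers written for this illustration only:

```haskell
-- Hypothetical helpers: both return their input list unchanged,
-- but they build the accumulator in different ways.

-- Appends each item at the end. Every (++) walks the whole accumulator
-- again, so the total work grows quadratically with the list length.
collectAppend :: [a] -> [a]
collectAppend = go []
  where
    go acc []     = acc
    go acc (x:xs) = go (acc ++ [x]) xs

-- Prepends each item (constant time per item), then reverses once
-- at the very end.
collectPrepend :: [a] -> [a]
collectPrepend = go []
  where
    go acc []     = reverse acc
    go acc (x:xs) = go (x : acc) xs
```

Both functions give the same result. The second one is the pattern used for symBits: prepend while traversing, reverse once when the bits are finally needed.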

So here is some suggested reworked code, where I have tried to distinguish the 4 cases: leaf node or fork node AND at end of bit stream or not. I also took the liberty to rename your c argument as htop.

data BTree a = Leaf a | Fork (BTree a) (BTree a)  deriving (Show, Eq)

type Bit = Bool


--            hnode   htop     symBits    bs
travHT :: BTree a -> BTree a -> [Bit] -> [Bit] -> ([a], [Bit])

-- situations where at least one input bit remains:
travHT (Leaf v) htop symBits (b:rbs) =  -- CHANGE: forward recursive call
                     -- symbol completed, jump from leaf node to top of htree:
                     let  fwdRes      = travHT htop htop [] (b:rbs)
                          nextSyms    = fst fwdRes
                          lastSymBits = snd fwdRes
                     in   (v : nextSyms, lastSymBits)
travHT (Fork left right) htop symBits (b:rbs)
   | b          = travHT right htop  (b:symBits)  rbs
   | otherwise  = travHT left htop   (b:symBits)  rbs

-- situations where we have reached the end of the bit stream:
travHT (Leaf v)           htop  symBits [] = ([v],[])
--   no more bits and not at a leaf --> incomplete last symbol:
travHT (Fork left right)  htop  symBits [] = ([], reverse symBits)

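-- homework-mandated interface:
-- (assumes htree is a Fork: with a single-Leaf tree, travHT never
--  terminates on a non-empty bit list, because no bit is ever consumed)
decode :: BTree a -> [Bit] -> ([a], Bool)
decode htree bs =
   let (symbols, restOfBits) = travHT htree htree [] bs
       weUsedAllBits = null restOfBits
   in  (symbols, weUsedAllBits)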

Testing code with token main program:

xyz_code :: BTree Char
xyz_code = Fork (Leaf 'x') (Fork (Leaf 'y') (Leaf 'z'))

-- Bit streams for test purposes:
------      Y           Z          X       X       X      Y/Z??
bl0 = [True,False,  True,True  , False,  False,  False]
bl1 = [True,False,  True,True  , False,  False,  False,  True]


main :: IO ()
main = do
    let bitList = bl0
    let htree   = xyz_code
    let result  = decode htree bitList
    putStrLn $ "result = " ++ show result

Program output:

result = ("yzxxx",True)
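The run above only exercises bl0. As a cross-check for the leftover-bits case, here is a sketch of an alternative formulation that decodes one symbol at a time. The names decodeSym, decodeAll and decodeFlag are hypothetical, and treating a single-Leaf tree as having no usable code is an assumption of this sketch, not something stated in the question:

```haskell
data BTree a = Leaf a | Fork (BTree a) (BTree a) deriving (Show, Eq)

-- Walk down from the given node, consuming one bit per Fork.
-- Returns the symbol found plus the unread bits, or Nothing if the
-- bits run out before a leaf is reached.
decodeSym :: BTree a -> [Bool] -> Maybe (a, [Bool])
decodeSym (Leaf v)   bs     = Just (v, bs)
decodeSym (Fork l r) (b:bs) = decodeSym (if b then r else l) bs
decodeSym (Fork _ _) []     = Nothing

-- Decode symbols until the bits are exhausted.
-- The second component holds the bits of an incomplete last symbol.
decodeAll :: BTree a -> [Bool] -> ([a], [Bool])
decodeAll (Leaf _) bs = ([], bs)  -- assumption: a one-leaf tree decodes nothing
decodeAll _        [] = ([], [])
decodeAll tree     bs =
  case decodeSym tree bs of
    Nothing        -> ([], bs)
    Just (v, rest) -> let (syms, left) = decodeAll tree rest
                      in  (v : syms, left)

-- Same result shape as the homework interface: symbols plus a flag.
decodeFlag :: BTree a -> [Bool] -> ([a], Bool)
decodeFlag tree bs = let (syms, left) = decodeAll tree bs
                     in  (syms, null left)
```

With xyz_code as the tree, decodeFlag gives ("yzxxx",True) for bl0 and ("yzxxx",False) for bl1, where the trailing True bit starts a symbol that is never completed.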

Hope it helps. I will also ask the powers that be to add the [huffman-code] tag to your question. Tags are a nice way to help people find the questions of interest to them. And we do have a tag for Huffman codes.



Source: https://stackoverflow.com/questions/58567971/huffman-tree-decoding
