LZW decompression algorithm

我寻月下人不归 2021-02-10 21:15

I'm writing a program for an assignment which has to implement LZW compression/decompression. I'm using the following algorithms for this:

-compression



        
2 Answers
  • 2021-02-10 22:13

    From http://en.wikipedia.org/wiki/Lempel%E2%80%93Ziv%E2%80%93Welch, are you running into this case?

    What happens if the decoder receives a code Z that is not yet in its dictionary? Since the decoder is always just one code behind the encoder, Z can be in the encoder's dictionary only if the encoder just generated it, when emitting the previous code X for χ. Thus Z codes some ω that is χ + ?, and the decoder can determine the unknown character as follows:

    1) The decoder sees X and then Z.
    2) It knows X codes the sequence χ and Z codes some unknown sequence ω.
    3) It knows the encoder just added Z to code χ + some unknown character,
    4) and it knows that the unknown character is the first letter z of ω.
    5) But the first letter of ω (= χ + ?) must then also be the first letter of χ.
    6) So ω must be χ + x, where x is the first letter of χ.
    7) So the decoder figures out what Z codes even though it's not in the table,
    8) and upon receiving Z, the decoder decodes it as χ + x, and adds χ + x to the table as the value of Z.
    

    This situation occurs whenever the encoder encounters input of the form cScSc, where c is a single character, S is a string and cS is already in the dictionary, but cSc is not. The encoder emits the code for cS, putting a new code for cSc into the dictionary. Next it sees cSc in the input (starting at the second c of cScSc) and emits the new code it just inserted. The argument above shows that whenever the decoder receives a code not in its dictionary, the situation must look like this.

    Although input of form cScSc might seem unlikely, this pattern is fairly common when the input stream is characterized by significant repetition. In particular, long strings of a single character (which are common in the kinds of images LZW is often used to encode) repeatedly generate patterns of this sort.


    For this specific case, the Wikipedia explanation fits: you have χ + ?, where the previous code X stands for (o) and Z (257) is not yet known. The unknown character is the first letter of χ, which gives (oo), so add (oo) to the table as 257. I am only going on what I read on Wikipedia; let us know how this turns out if that is not the solution.
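To see where a code like 257 comes from, here is a minimal encoder sketch in Python. It is not the asker's code (the compression listing is missing from the question); it assumes a dictionary preloaded with the 256 single-character codes, new codes starting at 256, no dictionary size limit, and input characters in the range 0..255. The name `lzw_encode` and the second return value, which lists the codes the decoder will not have in its table yet, are illustrative only.

```python
def lzw_encode(text):
    """Return (codes, unknown), where `unknown` lists the emitted codes
    that a decoder will not yet have in its dictionary on arrival.

    Assumes every character of `text` has a code point below 256.
    """
    dictionary = {chr(i): i for i in range(256)}
    next_code = 256
    codes = []
    unknown = []

    def emit(code):
        # next_code - 1 is the entry the encoder added most recently.
        # The decoder is one step behind, so it has not built it yet.
        if code >= 256 and code == next_code - 1:
            unknown.append(code)
        codes.append(code)

    w = ""
    for ch in text:
        if w + ch in dictionary:
            w += ch                      # keep extending the current match
        else:
            emit(dictionary[w])          # emit the longest known prefix
            dictionary[w + ch] = next_code
            next_code += 1
            w = ch
    if w:
        emit(dictionary[w])              # flush the final match
    return codes, unknown


print(lzw_encode("booop"))
print(lzw_encode("aaaaaa"))
```

With the input "booop", the encoder adds "oo" as 257 while emitting the code for "o" and then emits 257 straight away, which is exactly the cScSc situation (c = "o", S empty). A run such as "aaaaaa" triggers it on every code after the first.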

  • 2021-02-10 22:22

    Your compression part is right and complete, but the decompression part is incomplete: it only handles the case where the code is already in the dictionary. Since the decompression process is always one step behind the compression process, the decoder can receive a code that is not yet in its dictionary. Because it is only one step behind, it can work out what the encoder must have just added, output the correct decoded string, and then add that string to the dictionary. Extend your decompression process like this:

    -decompression

    read a character k;
    output k;
    w = k;
    while ( read a character k )
    /* k could be a character or a code. */
    {
        if k exists in the dictionary
        {
            entry = dictionary entry for k;
            output entry;
            add w + entry[0] to dictionary;
            w = entry;
        }
        else
        {
            /* k is the code the encoder just added, so it must be w + w[0]. */
            entry = w + firstCharacterOf(w);
            output entry;
            add entry to dictionary;
            w = entry;
        }
    }
    

    Then, when you decompress the file and reach 257, you find it is not in the dictionary. But you know the previous entry is 'o' and its first character is also 'o'; put them together and you get "oo". Output "oo" and add it to the dictionary. Next you get code 112, which you know is p. Done!

    w       k          entry        output       Dictionary
            98 (b)                  b
    b       111 (o)    o            o            bo (256)
    o       257 (oo)   oo           oo           oo (257)
    oo      112 (p)    p            p            oop (258)
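The pseudocode and the trace above can be sketched as runnable Python. This is a minimal sketch under stated assumptions, not a drop-in for the assignment: codes arrive as a list of integers, the dictionary is preloaded with the 256 single-character codes, new codes start at 256, and there is no dictionary size limit or code-width handling. The name `lzw_decode` is illustrative.

```python
def lzw_decode(codes):
    """Decode a list of integer LZW codes into a string.

    Assumes the first code is a single-character code (below 256).
    """
    if not codes:
        return ""
    dictionary = {i: chr(i) for i in range(256)}
    next_code = 256

    w = dictionary[codes[0]]
    output = [w]
    for k in codes[1:]:
        if k in dictionary:
            entry = dictionary[k]
        elif k == next_code:
            # Code not in the table yet: the encoder created it one step
            # ago, so it must be the previous string plus its first char.
            entry = w + w[0]
        else:
            raise ValueError(f"invalid LZW code: {k}")
        output.append(entry)
        # In the special case w + entry[0] equals entry itself, which
        # matches "add entry to dictionary" in the pseudocode.
        dictionary[next_code] = w + entry[0]
        next_code += 1
        w = entry
    return "".join(output)


print(lzw_decode([98, 111, 257, 112]))  # prints "booop"
```

Checking `k == next_code` rather than treating every unknown code as the special case lets the sketch reject corrupt input instead of silently producing wrong output.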
    

    See the explanation by Steve Blackstock for more information. There is also a better page with flow charts for the actual decoder and encoder implementation, on which the GIF encoder and decoder of the "icafe" Java image library are based.
