Python Compression Run Length encoding

六月ゝ 毕业季﹏ submitted on 2021-02-05 05:57:47

Question


I am trying to learn about run-length encoding, and I found this challenge online that I can't solve. It asks you to write a compression function, compression(strg), that takes a binary string strg of length 64 as input and returns another binary string as output. The output binary string should be a run-length encoding of the input string.

compression('1010101001010101101010100101010110101010010101011010101001010101')

'1010101001010101*4'

Here is what I have, but this does NOT find the pattern:

from itertools import groupby

def compression(strg):
    # Collapse each run of identical characters into a (run_length, character) pair.
    return [(len(list(group)), name) for name, group in groupby(strg)]

I need some help solving this.


Answer 1:


I believe that you are conflating RLE with Lempel/Ziv sliding window compression.

RLE strictly works on repeated characters: WWWWWWWW => W8
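To make the distinction concrete, here is a minimal sketch of character-level RLE. The function name rle_encode and the <char><count> output format are assumptions chosen to match the W8 example above, not part of the original challenge.

```python
from itertools import groupby

def rle_encode(text):
    """Run-length encode text as <char><count> pairs, e.g. 'WWWWWWWW' -> 'W8'."""
    # groupby yields one (char, run) pair per maximal run of equal characters.
    return ''.join('{0}{1}'.format(char, len(list(run)))
                   for char, run in groupby(text))

print(rle_encode('WWWWWWWW'))  # W8
print(rle_encode('10101010'))  # 1101110111011101
```

On an alternating input every run has length 1, so the encoded string comes out twice as long as the input. That is why a groupby-based approach cannot find the 16-character pattern in the question.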

LZ has a sliding window that will pick up patterns as you describe.

David MacKay's site has example compression code in Python, including LZ.




Answer 2:


This is an example of a longest repeated substring problem. It is classically solved with a suffix tree data structure.
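As a rough illustration of the longest repeated substring idea, the sketch below sorts all suffixes and compares neighbors. This is a naive stand-in for a suffix tree, assuming short inputs; the function name longest_repeated_substring is hypothetical, and the approach is far slower than a real suffix tree on long strings.

```python
def longest_repeated_substring(text):
    """Return the longest substring occurring at least twice (overlaps allowed)."""
    # Sort all suffixes; a repeated substring is a common prefix of two neighbors.
    suffixes = sorted(text[i:] for i in range(len(text)))
    best = ''
    for a, b in zip(suffixes, suffixes[1:]):
        # Measure the common prefix of two adjacent suffixes.
        k = 0
        while k < min(len(a), len(b)) and a[k] == b[k]:
            k += 1
        if k > len(best):
            best = a[:k]
    return best

print(longest_repeated_substring('banana'))  # ana
```

Note that the two occurrences may overlap, so the result is not necessarily a unit that tiles the whole string.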

For short strings, you can use a form of a regex:

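import re

s1 = '1010101001010101101010100101010110101010010101011010101001010101'

unit = s1  # current candidate for the repeating unit
while len(unit) >= 2 and len(unit) % 2 == 0:
    half = len(unit) // 2
    # Match only if the candidate is exactly two copies of its first half.
    m = re.search('^(.{' + str(half) + '})\\1$', unit)
    if not m:
        break
    unit = m.group(1)

# Each successful halving doubles the repeat count.
print('{0} * {1} = {2}'.format(unit, len(s1) // len(unit), s1))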

This prints your output. Note that it only works for strings that split into two identical halves at every step, so the repeat count must be a power of two; that is a small subset of this type of problem. To compress other types of strings, you would need a representational grammar of how the replaced elements are being substituted.
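For repeat counts that are not a power of two, one simple alternative is to try every unit length that divides the string length. The sketch below is a brute-force assumption rather than the method used in this answer. It reuses the compression(strg) name and the '<unit>*<count>' output format from the question, and it still only handles strings that consist of one unit repeated end to end.

```python
def compression(strg):
    """Return '<unit>*<count>' for the shortest unit that tiles strg exactly."""
    n = len(strg)
    for size in range(1, n + 1):
        # A unit can only tile the string if its length divides n.
        if n % size == 0 and strg[:size] * (n // size) == strg:
            return '{0}*{1}'.format(strg[:size], n // size)
    return strg  # only reached for the empty string

print(compression('1010101001010101' * 4))  # 1010101001010101*4
print(compression('abcabcabc'))             # abc*3
```

A string with no repetition comes back as the whole string with a count of 1, for example 'abcd*1'.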




Answer 3:


A detailed answer to this question is given at the following link:

Image compression by def compress(S) function using run-length codig

I hope it clarifies run-length encoding of strings and binary compression. The code there does not import re or itertools.



Source: https://stackoverflow.com/questions/12980229/python-compression-run-length-encoding
