plaintext = input("Please enter the text you want to compress")
filename = input("Please enter the desired filename")
You cannot serialize a Python 3 string to bytes without explicit conversion to some encoding.
outfile.write(plaintext.encode('utf-8'))
is possibly what you want. This also works on both Python 2.x and 3.x.
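As a minimal sketch of the point above (the sample text and the in-memory buffer are assumptions made for illustration, not part of the original program), this shows that Python 3 rejects a str written to a binary gzip stream and accepts the encoded bytes:

```python
import gzip
import io

text = "attack at dawn"      # hypothetical sample text
buffer = io.BytesIO()        # in-memory stand-in for a real file on disk

with gzip.GzipFile(fileobj=buffer, mode="wb") as outfile:
    try:
        outfile.write(text)  # str into a binary stream
        wrote_str = True
    except TypeError:
        wrote_str = False    # Python 3 refuses a str here
    outfile.write(text.encode("utf-8"))  # explicit encoding works

# Decompress and decode to check that the text survived the round trip.
restored = gzip.decompress(buffer.getvalue()).decode("utf-8")
print(wrote_str, restored)
```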
For Django unit testing with django.test.TestCase, I changed my Python 2 syntax:
def test_view(self):
    response = self.client.get(reverse('myview'))
    self.assertIn(str(self.obj.id), response.content)
    ...
To use the Python 3 .decode('utf8') syntax:
def test_view(self):
    response = self.client.get(reverse('myview'))
    self.assertIn(str(self.obj.id), response.content.decode('utf8'))
    ...
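The reason for the change can be reproduced without Django. In this rough sketch, content is a made-up bytes value standing in for response.content, and 42 is a made-up object id:

```python
content = b"<td>42</td>"   # hypothetical stand-in for response.content (bytes)
needle = str(42)           # hypothetical object id, as a str

try:
    needle in content      # str-in-bytes membership test
    mixed_ok = True
except TypeError:
    mixed_ok = False       # Python 3 will not mix str and bytes

# Either decode the bytes, or encode the str, so both sides match.
found_after_decode = needle in content.decode("utf8")
found_after_encode = needle.encode("utf8") in content
print(mixed_ok, found_after_decode, found_after_encode)
```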
In Python 3.x, a string is not the same type as in Python 2.x; you must convert it to bytes (encode it).
import gzip

plaintext = input("Please enter the text you want to compress")
filename = input("Please enter the desired filename")
with gzip.open(filename + ".gz", "wb") as outfile:
    outfile.write(bytes(plaintext, 'UTF-8'))
Also, do not use variable names like string or file, because they shadow the name of a module or a built-in.
EDIT @Tom
Yes, non-ASCII text is also compressed/decompressed. I use Polish letters with UTF-8 encoding:
import gzip

plaintext = 'Polish text: ąćęłńóśźżĄĆĘŁŃÓŚŹŻ'
filename = 'foo.gz'
with gzip.open(filename, 'wb') as outfile:
    outfile.write(bytes(plaintext, 'UTF-8'))
# 'r' is binary mode for gzip.open, so read() returns bytes to decode
with gzip.open(filename, 'r') as infile:
    outfile_content = infile.read().decode('UTF-8')
print(outfile_content)
There is an easier solution to this problem.
You just need to add a t to the mode so it becomes wt. This causes Python to open the file as a text file and not binary. Then everything will just work.
The complete program becomes this:
import gzip

plaintext = input("Please enter the text you want to compress")
filename = input("Please enter the desired filename")
with gzip.open(filename + ".gz", "wt") as outfile:
    outfile.write(plaintext)
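One caveat with text mode: if no encoding is given, Python falls back to the locale's default, which may not be UTF-8 on every system. A small sketch that passes encoding explicitly and reads the file back with rt (the sample text and temporary path are assumptions for illustration):

```python
import gzip
import os
import tempfile

plaintext = "Polish text: ąćęłńóśźż"                      # hypothetical sample text
path = os.path.join(tempfile.mkdtemp(), "example.gz")     # throwaway location

# "wt" opens the gzip file in text mode; str is encoded for us.
with gzip.open(path, "wt", encoding="utf-8") as outfile:
    outfile.write(plaintext)

# "rt" decodes on the way back, so read() returns a str.
with gzip.open(path, "rt", encoding="utf-8") as infile:
    restored = infile.read()

print(restored == plaintext)
```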
For Python 3.x you can convert your text to raw bytes through:
bytes("my data", "encoding")
For example:
bytes("attack at dawn", "utf-8")
The object returned will work with outfile.write.
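For what it's worth, bytes(text, encoding) and text.encode(encoding) should give the same result, so either spelling can be used. A quick check, with sample strings chosen only for illustration:

```python
text = "attack at dawn"

via_bytes = bytes(text, "utf-8")    # constructor form
via_encode = text.encode("utf-8")   # method form

# Non-ASCII characters can take more than one byte in UTF-8.
polish = "ą"
polish_bytes = bytes(polish, "utf-8")

print(via_bytes == via_encode, len(polish), len(polish_bytes))
```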
>>> s = bytes("s","utf-8")
>>> print(s)
b's'
>>> s = s.decode("utf-8")
>>> print(s)
s
This may be useful if you want to get rid of the annoying 'b' prefix. If anyone has a better idea, please suggest it or feel free to edit this answer. I'm just a newbie.