I am taking python at my college and I am stuck with my current assignment. We are supposed to take 2 files and compare them. I am simply trying to open the files so I can
If you are trying to open a file then you should use the path generated by os
, like so:
import os
os.path.join("path","to","the","file")
It seems that you have problems with characters "\" and "/". If you use them in input - try to change one to another...
The problem is due to bytes data that needs to be decoded.
When you insert a variable into the interpreter, it displays it's repr attribute whereas print() takes the str (which are the same in this scenario) and ignores all unprintable characters such as: \x00, \x01 and replaces them with something else.
A solution is to "decode" file1_content (ignore bytes):
file1_content = ''.join(x for x in file1_content if x.isprintable())
Default encoding of files for Python 3.5 is 'utf-8'.
Default encoding of files for Windows tends to be something else.
If you intend to open two text files, you may try this:
import locale
locale.getdefaultlocale()
file1 = input("Enter the name of the first file: ")
file1_open = open(file1, encoding=locale.getdefaultlocale()[1])
file1_content = file1_open.read()
There should be some automatic detection in the standard library.
Otherwise you may create your own:
def guess_encoding(csv_file):
"""guess the encoding of the given file"""
import io
import locale
with io.open(csv_file, "rb") as f:
data = f.read(5)
if data.startswith(b"\xEF\xBB\xBF"): # UTF-8 with a "BOM"
return "utf-8-sig"
elif data.startswith(b"\xFF\xFE") or data.startswith(b"\xFE\xFF"):
return "utf-16"
else: # in Windows, guessing utf-8 doesn't work, so we have to try
try:
with io.open(csv_file, encoding="utf-8") as f:
preview = f.read(222222)
return "utf-8"
except:
return locale.getdefaultlocale()[1]
and then
file1 = input("Enter the name of the first file: ")
file1_open = open(file1, encoding=guess_encoding(file1))
file1_content = file1_open.read()