Question
After looking into my question here, I found that it was caused by a simpler problem.
When I write "\n" to a file, I expect to read in "\n" from the file. This is not always the case in Windows.
In [1]: with open("out", "w") as file:
   ...:     file.write("\n")
   ...:
In [2]: with open("out", "r") as file:
   ...:     s = file.read()
   ...:
In [3]: s  # I expect "\n" and I get it
Out[3]: '\n'
In [4]: with open("out", "rb") as file:
   ...:     b = file.read()
   ...:
In [5]: b  # I expect b"\n"... Uh-oh
Out[5]: b'\r\n'
In [6]: with open("out", "wb") as file:
   ...:     file.write(b"\n")
   ...:
In [7]: with open("out", "r") as file:
   ...:     s = file.read()
   ...:
In [8]: s  # I expect "\n" and I get it
Out[8]: '\n'
In [9]: with open("out", "rb") as file:
   ...:     b = file.read()
   ...:
In [10]: b  # I expect b"\n" and I get it
Out[10]: b'\n'
In a more organized way:
| Method of Writing | Method of Reading | "\n" Turns Into |
|-------------------|-------------------|-----------------|
| "w" | "r" | "\n" |
| "w" | "rb" | b"\r\n" |
| "wb" | "r" | "\n" |
| "wb" | "rb" | b"\n" |
When I try this on my Linux virtual machine, it always returns \n. How can I get the same behavior in Windows?
Edit:
This is especially problematic with the pandas library, which appears to write DataFrames to csv with "w" and read csvs with "rb". See the question linked at the top for an example of this.
Answer 1:
Since you are using Python 3, you're in luck. When you open the file for writing, just specify newline='\n' to ensure that it writes '\n' instead of the system default, which is \r\n on Windows. From the docs:
When writing output to the stream, if newline is None, any '\n' characters written are translated to the system default line separator, os.linesep. If newline is '' or '\n', no translation takes place. If newline is any of the other legal values, any '\n' characters written are translated to the given string.
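As a minimal sketch of the rule quoted above (the helper name `write_then_read_bytes` and the temporary file are only for illustration, they are not from the question), writing the same text with two explicit `newline` values and reading it back in binary mode shows which bytes actually land on disk:

```python
import os
import tempfile


def write_then_read_bytes(text, newline):
    """Write text in text mode with the given newline setting, return the raw bytes."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path, "w", newline=newline) as f:
            f.write(text)
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.remove(path)


# newline="\n": no translation, on any OS
unix_bytes = write_then_read_bytes("a\nb\n", "\n")
# newline="\r\n": every "\n" written becomes "\r\n", on any OS
crlf_bytes = write_then_read_bytes("a\nb\n", "\r\n")

print(unix_bytes)  # b'a\nb\n'
print(crlf_bytes)  # b'a\r\nb\r\n'
```

Both values are passed explicitly here, so the result should not depend on the platform. Leaving `newline` at its default of `None` is what makes the output differ between Windows and Linux.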
The reason you think you are "sometimes" seeing the two-character output is that when you open the file in binary mode, no conversion is done at all. Bytes objects are just displayed as ASCII whenever possible for your convenience. Don't think of them as real strings until they have been decoded. The binary output you show is the true contents of the file in all your examples.
When you open the file for reading in the default text mode, the newline parameter will work similarly to how it does for writing. By default all \r\n in the file will be converted to just \n after the characters are decoded. This is very nice when your code travels between OSes but your files do not, since you can use the exact same code that relies only on \n. If your files travel too, you should stick to the relatively portable newline='\n' for at least the output.
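The reading side can be sketched the same way. This is only an illustration under the assumption of a throwaway temporary file; the helper name `read_text` is made up for the example. The file is written in binary mode so its contents are known exactly, then read back in text mode with two `newline` settings:

```python
import os
import tempfile


def read_text(raw, newline):
    """Store raw bytes in a temp file, then read them in text mode with the given newline."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path, "wb") as f:
            f.write(raw)
        with open(path, "r", newline=newline) as f:
            return f.read()
    finally:
        os.remove(path)


raw = b"a\r\nb\n"  # one Windows-style and one Unix-style line ending

translated = read_text(raw, None)  # default: "\r\n" is converted to "\n"
untranslated = read_text(raw, "")  # newline="": line endings are left as-is

print(repr(translated))    # 'a\nb\n'
print(repr(untranslated))  # 'a\r\nb\n'
```

This is why "w" followed by "r" looked fine in the question: the default text-mode read quietly undoes the translation that the default text-mode write applied.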
Answer 2:
From the documentation:
newline controls how universal newlines mode works (it only applies to text mode). It can be None, '', '\n', '\r', and '\r\n'. It works as follows:
[...]
- When writing output to the stream, if newline is None, any '\n' characters written are translated to the system default line separator, os.linesep. If newline is '' or '\n', no translation takes place. If newline is any of the other legal values, any '\n' characters written are translated to the given string.
open(..., 'w', newline='')
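Since the question is ultimately about csv files, here is a small sketch of how `newline=''` might be combined with the standard library's `csv` module. The file name suffix, the column values, and the choice of `lineterminator="\n"` are assumptions for illustration; nothing here is taken from pandas itself.

```python
import csv
import os
import tempfile

fd, path = tempfile.mkstemp(suffix=".csv")
os.close(fd)
try:
    # newline="" turns off newline translation in the file object,
    # so the csv writer alone decides which line terminator is used.
    with open(path, "w", newline="") as f:
        writer = csv.writer(f, lineterminator="\n")
        writer.writerow(["x", "y"])
        writer.writerow([1, 2])

    # Binary mode shows the true contents of the file.
    with open(path, "rb") as f:
        raw = f.read()
finally:
    os.remove(path)

print(raw)  # b'x,y\n1,2\n'
```

Without `lineterminator="\n"`, `csv.writer` uses its default of `'\r\n'`, which `newline=''` would also pass through unchanged.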
Source: https://stackoverflow.com/questions/47384652/python-write-replaces-n-with-r-n-in-windows