Python Write Replaces “\n” With “\r\n” in Windows

Submitted by 妖精的绣舞 on 2019-12-10 09:24:56

Question


After looking into my question here, I found that it was caused by a simpler problem.

When I write "\n" to a file, I expect to read in "\n" from the file. This is not always the case in Windows.

In [1]: with open("out", "w") as file:
   ...:     file.write("\n")
   ...:

In [2]: with open("out", "r") as file:
   ...:     s = file.read()
   ...:

In [3]: s  # I expect "\n" and I get it
Out[3]: '\n'

In [4]: with open("out", "rb") as file:
   ...:     b = file.read()
   ...:

In [5]: b  # I expect b"\n"... Uh-oh
Out[5]: b'\r\n'

In [6]: with open("out", "wb") as file:
   ...:     file.write(b"\n")
   ...:

In [7]: with open("out", "r") as file:
   ...:     s = file.read()
   ...:

In [8]: s  # I expect "\n" and I get it
Out[8]: '\n'

In [9]: with open("out", "rb") as file:
   ...:     b = file.read()
   ...:

In [10]: b  # I expect b"\n" and I get it
Out[10]: b'\n'

In a more organized way:

| Method of Writing | Method of Reading | "\n" Turns Into |
|-------------------|-------------------|-----------------|
| "w"               | "r"               | "\n"            |
| "w"               | "rb"              | b"\r\n"         |
| "wb"              | "r"               | "\n"            |
| "wb"              | "rb"              | b"\n"           |

When I try this on my Linux virtual machine, I always get \n back. How can I get the same behavior on Windows?

Edit: This is especially problematic with the pandas library, which appears to write DataFrames to csv with "w" and read csvs with "rb". See the question linked at the top for an example of this.


Answer 1:


Since you are using Python 3, you're in luck. When you open the file for writing, just specify newline='\n' to ensure that it writes '\n' instead of the system default, which is \r\n on Windows. From the docs:

When writing output to the stream, if newline is None, any '\n' characters written are translated to the system default line separator, os.linesep. If newline is '' or '\n', no translation takes place. If newline is any of the other legal values, any '\n' characters written are translated to the given string.
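A minimal sketch of that rule, assuming a scratch file in a temporary directory (the file name `out.txt` and the helper `write_and_read_raw` are made up for illustration). It writes the same text with three `newline` settings and reads the raw bytes back. Passing `newline='\r\n'` reproduces the Windows default on any platform, so the difference is visible even on Linux.

```python
import os
import tempfile


def write_and_read_raw(text, newline):
    """Write text in text mode with the given newline setting; return the raw bytes."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "out.txt")  # hypothetical scratch file
        with open(path, "w", newline=newline) as f:
            f.write(text)
        with open(path, "rb") as f:
            return f.read()


default_bytes = write_and_read_raw("a\n", None)    # '\n' becomes os.linesep
unix_bytes = write_and_read_raw("a\n", "\n")       # no translation
windows_bytes = write_and_read_raw("a\n", "\r\n")  # '\n' becomes '\r\n' everywhere

print(default_bytes, unix_bytes, windows_bytes)
```

Only the first result depends on the operating system; the other two should be the same wherever the code runs.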

You only "sometimes" see the two-character output because binary mode performs no conversion at all, so it shows what is really there. bytes objects are just displayed as ASCII characters wherever possible for your convenience. Don't think of them as real strings until they have been decoded. The binary output you show is the true contents of the file in all your examples.
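To make the point about bytes concrete, here is a small sketch that takes the value seen in `Out[5]` and inspects it (the variable names are chosen only for illustration). It shows that the value holds two bytes, and that decoding by itself does not translate line endings; only text-mode file I/O does.

```python
raw = b"\r\n"  # the value returned by the "rb" read in In [5]

byte_count = len(raw)       # number of bytes actually stored in the file
byte_values = list(raw)     # integer value of each byte
text = raw.decode("ascii")  # decoding turns bytes into str, nothing more

print(byte_count)
print(byte_values)
print(repr(text))
```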

When you open the file for reading in the default text mode, the newline parameter works much as it does for writing. By default, every \r\n in the file is converted to just \n after the characters are decoded. This is very nice when your code travels between OSes but your files do not, since the exact same code can rely only on \n. If your files travel too, you should stick to the relatively portable newline='\n', at least for output.
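The reading side can be sketched the same way, again assuming a throwaway file (`crlf.txt` is a made-up name). The file is written in binary mode so that it contains \r\n regardless of platform, then read once with the default `newline=None` and once with `newline=''`, which turns translation off.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "crlf.txt")  # hypothetical scratch file

    # Binary mode writes exactly these bytes on every platform.
    with open(path, "wb") as f:
        f.write(b"one\r\ntwo\r\n")

    # Default text mode (newline=None): '\r\n' is translated to '\n'.
    with open(path, "r") as f:
        translated = f.read()

    # newline='': line endings are passed through untranslated.
    with open(path, "r", newline="") as f:
        untranslated = f.read()

print(repr(translated))
print(repr(untranslated))
```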




Answer 2:


From the documentation:

newline controls how universal newlines mode works (it only applies to text mode). It can be None, '', '\n', '\r', and '\r\n'. It works as follows:

[...]

  • When writing output to the stream, if newline is None, any '\n' characters written are translated to the system default line separator, os.linesep. If newline is '' or '\n', no translation takes place. If newline is any of the other legal values, any '\n' characters written are translated to the given string.

open(..., 'w', newline='')


Source: https://stackoverflow.com/questions/47384652/python-write-replaces-n-with-r-n-in-windows
