Line reading chokes on 0x1A

[愿得一人] 2020-11-28 14:37

I have the following file:

abcde
kwakwa
<0x1A>
line3
linllll

Where <0x1A> represents a byte with the hex value of 0x1A.

2 Answers
  • 2020-11-28 15:12

    0x1A is Ctrl-Z, and DOS historically used it as an end-of-file marker. For example, open a command prompt and "type" your file: it will only display the content up to the Ctrl-Z.

    Python uses the Windows CRT function _wfopen, which implements the "Ctrl-Z is EOF" semantics for files opened in text mode.

  • 2020-11-28 15:13

    Ned is of course correct.

    If your curiosity runs a little deeper, the root cause is backwards compatibility taken to an extreme. Windows is compatible with DOS, which used Ctrl-Z as an optional end-of-file marker for text files. What you might not know is that DOS was in turn compatible with CP/M, which was popular on small computers before the PC. CP/M's file system didn't keep track of file sizes down to the byte level; it only counted the number of floppy disk sectors a file occupied. If your file wasn't an exact multiple of 128 bytes, you needed a way to mark the end of the text. This Wikipedia article implies that the selection of Ctrl-Z was based on an even older convention used by DEC.

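Since the first answer attributes the truncation to text-mode handling in the Windows CRT, one common way around it is to open the file in binary mode, where no byte is interpreted as an end-of-file marker. The following is only a minimal sketch under that assumption. The question's original reading code is not shown, so the names `SAMPLE`, `read_lines_binary`, and `demo` are hypothetical, and the sample bytes are reconstructed from the file listing in the question.

```python
import os
import tempfile

# Sample content reconstructed from the question: the third line holds a raw 0x1A byte.
SAMPLE = b"abcde\nkwakwa\n\x1a\nline3\nlinllll\n"


def read_lines_binary(path):
    """Return all lines of the file, treating 0x1A as ordinary data."""
    # "rb" opens the file in binary mode, so Ctrl-Z is not treated as end of file.
    with open(path, "rb") as f:
        data = f.read()
    # bytes.splitlines() splits only on \n, \r and \r\n, so the 0x1A byte survives.
    return [line.decode("ascii") for line in data.splitlines()]


def demo():
    """Write the sample to a temporary file and read it back in binary mode."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, SAMPLE)
        os.close(fd)
        return read_lines_binary(path)
    finally:
        os.remove(path)
```

Calling `demo()` should return all five lines, including the one that contains only the 0x1A byte. The trade-off is that binary mode leaves line-ending and encoding handling to the caller, which is why the sketch splits and decodes explicitly.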