I'm trying to use something in bash to show me the line endings in a file printed rather than interpreted. The file is a dump from SSIS/SQL Server being read in by a Linux
You may use the command `todos filename` to convert to DOS line endings, and `fromdos filename` to convert to UNIX line endings. To install the package on Ubuntu, type `sudo apt-get install tofrodos`.
You can use `xxd` to show a hex dump of the file, and hunt through for "0d0a" or "0a" bytes.
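If `xxd` is not installed, the same hunt can be sketched with `od`, which is part of POSIX. This is only an illustration: the sample file is a throwaway created for the demo, and the variable names `sample`, `hex` and `has_crlf` are made up here.

```shell
# Hypothetical sample file, created only for this demonstration.
sample=$(mktemp)
printf 'one\r\ntwo\r\n' > "$sample"

# -An drops the offset column, -v disables the "*" duplicate-line shorthand,
# -tx1 prints one hex byte per column. tr joins everything onto one line so a
# CR/LF pair split across two od output lines is still found.
hex=$(od -An -v -tx1 "$sample" | tr -s ' \n' ' ')
echo "$hex"

# A CR immediately followed by an LF shows up as "0d 0a".
case "$hex" in
  *"0d 0a"*) has_crlf=yes ;;
  *)         has_crlf=no ;;
esac
echo "CRLF present: $has_crlf"

rm -f "$sample"
```

For large files you would probably pipe the dump through `grep` rather than hold it in a variable.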
You can use `cat -v <filename>` as @warriorpostman suggests.
In the bash shell, try `cat -v <filename>`. This should display carriage returns for Windows files.
(This worked for me in rxvt via Cygwin on Windows XP.)
Editor's note: `cat -v` visualizes `\r` (CR) characters as `^M`. Thus, line-ending `\r\n` sequences will display as `^M` at the end of each output line. `cat -e` will additionally visualize `\n`, namely as `$`. (`cat -et` will additionally visualize tab characters as `^I`.)
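A quick way to see the difference between the three flag combinations is to feed `cat` a known input. The sample text and the variable names below are made up for the demo.

```shell
# Sample input: a CRLF-terminated line containing a tab, then an LF-terminated line.
shown_v=$(printf 'a\tb\r\nc\n' | cat -v)
shown_e=$(printf 'a\tb\r\nc\n' | cat -e)
shown_et=$(printf 'a\tb\r\nc\n' | cat -et)

echo "cat -v:";  echo "$shown_v"    # CR shown as ^M, tab left alone
echo "cat -e:";  echo "$shown_e"    # additionally marks each line end with $
echo "cat -et:"; echo "$shown_et"   # additionally shows the tab as ^I
# The last one prints:
#   a^Ib^M$
#   c$
```

A line that ends in `^M$` is a CRLF line; a line that ends in a bare `$` is an LF line.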
If you prefer to always see the Windows newlines in vim render as `^M`, you can add this line to your `.vimrc`:
set ffs=unix
This will make vim interpret every file you open as a unix file. Since unix files have `\n` as the newline character, a Windows file with a newline sequence of `\r\n` will still render properly (thanks to the `\n`) but will have `^M` at the end of each line (which is how vim renders the `\r` character).
If you'd prefer just to set it on a per-file basis, you can use `:e ++ff=unix` when editing a given file.
If you want the bottom line of vim to always display which file format (`unix` vs `dos`) you're editing (and you didn't force set the file format to unix), you can add it to your statusline with `set statusline+=\[%{&fileformat}\]`.
My full statusline is provided below. Just add it to your `.vimrc`.
" Make statusline stay, otherwise alerts will hide it
set laststatus=2
set statusline=
set statusline+=%#PmenuSel#
set statusline+=%#LineNr#
" This says 'show filename and parent dir'
set statusline+=%{expand('%:p:h:t')}/%t
" This says 'show filename as would be read from the cwd'
" set statusline+=\ %f
set statusline+=%m\
set statusline+=%=
set statusline+=%#CursorColumn#
set statusline+=\ %y
set statusline+=\ %{&fileencoding?&fileencoding:&encoding}
set statusline+=\[%{&fileformat}\]
set statusline+=\ %p%%
set statusline+=\ %l:%c
set statusline+=\
It'll render like
.vim/vimrc\ [vim] utf-8[unix] 77% 315:6
at the bottom of your file
If you just want to see which file format (`unix` vs `dos`) you have, you can use `:set fileformat` (this will not work if you've force set the file format). It will report `unix` for unix files and `dos` for Windows files.
In `vi`...
`:set list` to see line endings.
`:set nolist` to go back to normal.
While I don't think you can see `\n` or `\r\n` in `vi`, you can see which type of file it is (UNIX, DOS, etc.) to infer which line endings it has...
:set ff
Alternatively, from `bash` you can use `od -t c <filename>` or just `od -c <filename>` to display the returns.
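To get a feel for what to look for in the `od -c` output, here is a tiny sketch with a made-up two-character input instead of a real file (the variable name `dump` is just for the demo):

```shell
# A single CRLF-terminated line, dumped character by character.
dump=$(printf 'hi\r\n' | od -c)
echo "$dump"

# od -c prints the CR as \r and the LF as \n, so a Windows line ending
# appears as \r directly followed by \n in the dump.
if echo "$dump" | grep -q '\\r  *\\n'; then
  echo "found a CRLF line ending"
fi
```

A file with Unix line endings would show `\n` without a preceding `\r`.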
`file`, then `file -k`, then `dos2unix -ih`
`file` will usually be enough. But for tough cases try `file -k` or `dos2unix -ih`.
Details below.
`file -k`
Short version: `file -k somefile.txt` will tell you.
It will output `with CRLF line terminators` for DOS/Windows line endings.
It will output `with CR line terminators` for (classic) Mac line endings.
For Linux/Unix LF line endings it will just output `text`. (So if it does not explicitly mention any kind of line terminators, this implicitly means "LF line endings".)
Long version: see below.
I sometimes have to check this for PEM certificate files.
The trouble with regular `file` is this: sometimes it's trying to be too smart/too specific.
Let's try a little quiz: I've got some files. And one of these files has different line endings. Which one?
(By the way: this is what one of my typical "certificate work" directories looks like.)
Let's try regular `file`:
$ file -- *
0.example.end.cer: PEM certificate
0.example.end.key: PEM RSA private key
1.example.int.cer: PEM certificate
2.example.root.cer: PEM certificate
example.opensslconfig.ini: ASCII text
example.req: PEM certificate request
Huh. It's not telling me the line endings. And I already knew that those were cert files. I didn't need "file" to tell me that.
What else can you try?
You might try `dos2unix` with the `--info` switch like this:
$ dos2unix --info -- *
37 0 0 no_bom text 0.example.end.cer
0 27 0 no_bom text 0.example.end.key
0 28 0 no_bom text 1.example.int.cer
0 25 0 no_bom text 2.example.root.cer
0 35 0 no_bom text example.opensslconfig.ini
0 19 0 no_bom text example.req
So that tells you that: yup, "0.example.end.cer" must be the odd man out. But what kind of line endings are there? Do you know the dos2unix output format by heart? (I don't.)
But fortunately there's the `--keep-going` (or `-k` for short) option in `file`:
$ file --keep-going -- *
0.example.end.cer: PEM certificate\012- , ASCII text, with CRLF line terminators\012- data
0.example.end.key: PEM RSA private key\012- , ASCII text\012- data
1.example.int.cer: PEM certificate\012- , ASCII text\012- data
2.example.root.cer: PEM certificate\012- , ASCII text\012- data
example.opensslconfig.ini: ASCII text\012- data
example.req: PEM certificate request\012- , ASCII text\012- data
Excellent! Now we know that our odd file has DOS (`CRLF`) line endings. (And the other files have Unix (`LF`) line endings. This is not explicit in this output. It's implicit. It's just the way `file` expects a "regular" text file to be.)
(If you wanna share my mnemonic: "L" is for "Linux" and for "LF".)
Now let's convert the culprit and try again:
$ dos2unix -- 0.example.end.cer
$ file --keep-going -- *
0.example.end.cer: PEM certificate\012- , ASCII text\012- data
0.example.end.key: PEM RSA private key\012- , ASCII text\012- data
1.example.int.cer: PEM certificate\012- , ASCII text\012- data
2.example.root.cer: PEM certificate\012- , ASCII text\012- data
example.opensslconfig.ini: ASCII text\012- data
example.req: PEM certificate request\012- , ASCII text\012- data
Good. Now all certs have Unix line endings.
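If you want a cross-check that does not depend on the wording of `file`, a rough sketch with plain `grep` can list the files that still contain a CR at the end of a line. The temporary directory and the file names `dos.txt` and `unix.txt` are invented for the demo; in a real directory you would just run the `grep` line.

```shell
# Build a literal CR in a portable way (plain sh has no $'\r').
cr=$(printf '\r')

# Hypothetical scratch directory with one CRLF file and one LF file.
workdir=$(mktemp -d)
printf 'x\r\n' > "$workdir/dos.txt"
printf 'x\n'   > "$workdir/unix.txt"

# -l prints only the names of files that have at least one line ending in CR.
crlf_files=$(cd "$workdir" && grep -l "${cr}\$" -- *)
echo "$crlf_files"

rm -rf "$workdir"
```

An empty result means no file in the directory has a line ending in CR.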
`dos2unix -ih`
I didn't know this when I was writing the example above but:
Actually it turns out that dos2unix will give you a header line if you use `-ih` (short for `--info=h`) like so:
$ dos2unix -ih -- *
DOS UNIX MAC BOM TXTBIN FILE
0 37 0 no_bom text 0.example.end.cer
0 27 0 no_bom text 0.example.end.key
0 28 0 no_bom text 1.example.int.cer
0 25 0 no_bom text 2.example.root.cer
0 35 0 no_bom text example.opensslconfig.ini
0 19 0 no_bom text example.req
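On a machine without `dos2unix`, the first three columns can be approximated with standard tools. The following is only a sketch: `count_endings` is a made-up helper, it is not how `dos2unix` itself counts, and it may disagree with `dos2unix` on unusual or binary files.

```shell
# count_endings FILE: print the number of CRLF, lone-LF and lone-CR line breaks,
# in the same order as the DOS / UNIX / MAC columns.
count_endings() {
  # Turn every byte that is not CR or LF into "x" (squeezed), then rename
  # CR -> c and LF -> l, and finally mark each adjacent "cl" pair (CRLF) as d.
  marks=$(LC_ALL=C tr -c '\r\n' 'x' < "$1" | tr -s 'x' | tr '\r\n' 'cl' | sed 's/cl/d/g')
  dos=$(printf '%s' "$marks" | tr -cd 'd' | wc -c)
  unix=$(printf '%s' "$marks" | tr -cd 'l' | wc -c)
  mac=$(printf '%s' "$marks" | tr -cd 'c' | wc -c)
  # Arithmetic expansion strips the padding some wc implementations add.
  echo "$((dos)) $((unix)) $((mac))"
}

# Hypothetical sample: two CRLF breaks, two lone LF breaks, one lone CR.
sample=$(mktemp)
printf 'a\r\nb\r\nc\nd\re\n' > "$sample"
counts=$(count_endings "$sample")
echo "DOS UNIX MAC"
echo "$counts"
rm -f "$sample"
```

The placeholder `x` matters: deleting the other bytes outright would push a lone CR up against a later LF and miscount the pair as a DOS line break.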
And another "actually" moment: The header format is really easy to remember: Here's two mnemonics: