How to convert Windows line endings to Unix line endings (CR/LF to LF)

臣服心动 2020-11-29 18:59

I'm a Java developer and I develop on Ubuntu. The project was created on Windows with Eclipse and uses the Windows-1252 encoding.

To convert to UTF-8

8 answers
  • 2020-11-29 19:17

    Go back to Windows, tell Eclipse to change the encoding to UTF-8, then go back to Unix and run d2u on the files.

  • 2020-11-29 19:19

    Actually, vim does allow what you're looking for. Enter vim, and type the following commands:

    :args **/*.java
    :argdo set ff=unix | update
    

    The first of these commands sets the argument list to every file matching **/*.java, which is all Java files, recursively. The second command does the following to each file in the argument list, in turn:

    • Sets the line endings to Unix style (you already know this)
    • Writes the file out only if it has been changed
    • Proceeds to the next file (argdo does this by itself, so no explicit :next is needed; adding one raises an error on the last file)
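
For anyone who would rather script this than open vim, the same loop can be sketched in Python. This is only an illustration, not part of the answer above: the function name convert_tree and its parameters are made up here, and the sketch assumes the files are small enough to read into memory.

```python
from pathlib import Path


def convert_tree(root, pattern="*.java"):
    """Rewrite CRLF as LF in every file under root that matches pattern.

    Returns the list of paths that were actually rewritten.
    """
    changed = []
    for path in sorted(Path(root).rglob(pattern)):
        if not path.is_file():
            continue
        original = path.read_bytes()
        converted = original.replace(b"\r\n", b"\n")
        # Like vim's :update, write the file only when its content changed.
        if converted != original:
            path.write_bytes(converted)
            changed.append(path)
    return changed
```

Calling convert_tree(".") from the project root would touch only the Java files that still contain CR/LF pairs and leave every other file's modification time alone.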
  • 2020-11-29 19:30

    To get past the error

    Ambiguous output in step `CR-LF..data'
    

    a simple solution might be to add the -f flag to force the conversion.

  • 2020-11-29 19:32

    There should be a program called dos2unix that will fix line endings for you. If it's not already on your Linux box, it should be available via the package manager.
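
To make it concrete what "fixing line endings" means at the byte level, here is a minimal Python sketch. It is not how dos2unix itself is implemented, and the function name crlf_to_lf is made up for illustration.

```python
def crlf_to_lf(data: bytes) -> bytes:
    """Replace every CR/LF pair with a single LF.

    A CR that is not followed by LF is left untouched.
    """
    return data.replace(b"\r\n", b"\n")
```

Working on bytes rather than decoded text means the conversion does not depend on whether the file is Windows-1252 or UTF-8, since CR and LF are encoded as the same single bytes in both.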

  • 2020-11-29 19:35

    Did you try the Python script by Bryan Maupin found here? (I've modified it a little bit to be more generic.)

    #!/usr/bin/env python3
    
    import sys
    
    input_file_name = sys.argv[1]
    output_file_name = sys.argv[2]
    
    # Open both files in binary mode so the bytes are decoded and encoded explicitly.
    with open(input_file_name, 'rb') as input_file, \
            open(output_file_name, 'wb') as output_file:
        for line_number, input_line in enumerate(input_file, start=1):
            try:  # first try to decode it using cp1252 (Windows, Western Europe)
                output_line = input_line.decode('cp1252').encode('utf8')
            except UnicodeDecodeError as error:  # cp1252 leaves a few byte values undefined
                sys.stderr.write('ERROR (line %s):\t%s\n' % (line_number, error))  # write to stderr
                # then fall back to latin1 (ISO 8859-1), which maps every byte and cannot fail
                output_line = input_line.decode('latin1').encode('utf8')
            output_file.write(output_line)
    

    You can use that script with

    $ ./cp1252_utf8.py file_cp1252.sql file_utf8.sql
    
  • 2020-11-29 19:37

    The tr command can also do this:

    tr -d '\15\32' < winfile.txt > unixfile.txt
    

    and should be available to you.

    tr only reads standard input and writes standard output; it cannot take file names. To process many files you'll need to run it from within a script. For example, create a file myscript.sh:

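    #!/bin/bash
    
    # Strip CR (\15) and Ctrl-Z (\32) from every .java file, then re-encode it as UTF-8.
    # -print0 with read -d '' keeps file names that contain spaces intact.
    find . -iname '*.java' -print0 | while IFS= read -r -d '' f; do
        echo "$f"
        tr -d '\15\32' < "$f" > "$f.tr"
        mv "$f.tr" "$f"
        recode CP1252..UTF-8 "$f"
    done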
    

    Running myscript.sh processes all the Java files in the current directory and its subdirectories.
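
If tr or recode is not installed, the two steps of the loop body can be approximated in Python. This is a hedged sketch, not a drop-in replacement: the function name strip_and_reencode is invented here, and using errors="replace" for undecodable bytes is an assumption rather than what recode does.

```python
def strip_and_reencode(raw: bytes) -> bytes:
    """Delete CR and Ctrl-Z bytes, then re-encode cp1252 text as UTF-8."""
    # bytes.translate(None, delete) drops every listed byte, like tr -d.
    # Octal \15 is CR (0x0D) and octal \32 is Ctrl-Z (0x1A).
    cleaned = raw.translate(None, b"\x0d\x1a")
    # Assumption: the few byte values cp1252 leaves undefined become U+FFFD
    # instead of raising UnicodeDecodeError.
    return cleaned.decode("cp1252", errors="replace").encode("utf-8")
```

Like tr -d, this removes every CR, including one that is not followed by LF. Deleting the bytes before decoding is safe because cp1252 is a single-byte encoding, so 0x0D and 0x1A can never be part of another character.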
