Best way to convert text files between character sets?

再見小時候 2020-11-22 04:42

What is the fastest, easiest tool or method to convert text files between character sets?

Specifically, I need to convert from UTF-8 to ISO-8859-15 and vice versa.

21 answers
  •  囚心锁ツ
    2020-11-22 05:01

    One-liner using find, with automatic character set detection

    The character encoding of each matching text file is detected automatically, and the file is then converted to UTF-8 in place:

    $ find . -type f -iname '*.txt' -exec sh -c 'iconv -f "$(file -bi "$1" | sed -e "s/.*[ ]charset=//")" -t utf-8 -o converted "$1" && mv converted "$1"' -- {} \;
    

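    To perform these steps, -exec runs a sub-shell (sh) that executes the one-liner given to the -c flag. The filename is passed with -- {}: the -- fills $0, so the filename substituted for {} arrives as the positional argument "$1". During the conversion, the UTF-8 output is written to a temporary file named converted, which then replaces the original.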
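The same steps can be spelled out as a small shell function, which may be easier to read and adapt. This is only a sketch under some assumptions: the name to_utf8 and the optional second argument (an explicit source charset that skips detection) are made up here and are not part of the one-liner, mktemp is assumed to be available, and file is assumed to print a MIME string for -bi as described in this answer.

```shell
#!/bin/sh
# Hypothetical helper (not from the original one-liner):
# convert one file to UTF-8 in place.
# Usage: to_utf8 FILE [SOURCE_CHARSET]
to_utf8() {
    src=$1
    # Use the charset given as $2, or detect it with file -bi,
    # e.g. "iso-8859-1" from "text/plain; charset=iso-8859-1".
    charset=${2:-$(file -bi "$src" | sed -e 's/.*[ ]charset=//')}
    # Convert into a temporary file so a failed conversion
    # never truncates the original.
    tmp=$(mktemp) || return 1
    if iconv -f "$charset" -t utf-8 "$src" > "$tmp"; then
        # Copy the contents back instead of using mv, so the
        # original file keeps its permissions.
        cat "$tmp" > "$src"
        status=$?
    else
        status=1
    fi
    rm -f "$tmp"
    return "$status"
}
```

With the function defined, a loop such as for f in *.txt; do to_utf8 "$f"; done would convert every .txt file in the current directory. Passing the charset explicitly can help when file guesses wrong.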

    Here, the options in file -bi mean:

    • -b, --brief Do not prepend filenames to output lines (brief mode).

    • -i, --mime Causes the file command to output MIME type strings rather than the more traditional human-readable ones. Thus it may say, for example, text/plain; charset=us-ascii rather than ASCII text.

    The sed command cuts this down to just us-ascii, which is what iconv requires.
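To see what the sed expression does in isolation, it can be fed a sample line in the shape that file -bi prints. The sample string below is typed in by hand for illustration, not captured from a real run, and the variable names are arbitrary:

```shell
# Sample MIME string in the shape file -bi prints (hand-written).
mime='text/plain; charset=us-ascii'
# Delete everything up to and including " charset=",
# leaving only the charset name for iconv -f.
charset=$(printf '%s\n' "$mime" | sed -e 's/.*[ ]charset=//')
echo "$charset"   # prints: us-ascii
```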

    The find command is very useful for this kind of file management automation.
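For the specific pair asked about in the question, both encodings are already known, so detection can be skipped and iconv called directly in each direction. A minimal sketch, with made-up file names and a one-line sample input:

```shell
# Made-up sample input: the euro sign (bytes e2 82 ac in UTF-8)
# followed by "5" and a newline.
printf '\342\202\2545\n' > utf8.txt

# UTF-8 -> ISO-8859-15
iconv -f UTF-8 -t ISO-8859-15 utf8.txt > latin9.txt

# ISO-8859-15 -> UTF-8
iconv -f ISO-8859-15 -t UTF-8 latin9.txt > roundtrip.txt

# The round trip should reproduce the original bytes.
cmp utf8.txt roundtrip.txt && echo "round trip OK"
```

Redirecting to a separate output file, rather than back onto the input, avoids truncating the source before iconv has read it. Characters that do not exist in ISO-8859-15 make the first conversion fail, so the exit status is worth checking in a script.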
