Best way to convert text files between character sets?

再見小時候 2020-11-22 04:42

What is the fastest, easiest tool or method to convert text files between character sets?

Specifically, I need to convert from UTF-8 to ISO-8859-15 and vice versa.

21 answers

囚心锁ツ (original poster)

2020-11-22 05:01
One-liner using find, with automatic character set detection

The character encoding of each matching text file is detected automatically, and the file is then converted to UTF-8:
```
$ find . -type f -iname '*.txt' -exec sh -c 'iconv -f "$(file -bi "$1" | sed -e "s/.*[ ]charset=//")" -t utf-8 -o converted "$1" && mv converted "$1"' -- {} \;
```
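To perform these steps, -exec starts a subshell (sh) that runs the one-liner given to the -c flag. The first argument after the command string, here --, fills $0, so the filename substituted for {} arrives as the positional argument "$1". During the conversion, the UTF-8 output is written to a temporary file named converted, which then replaces the original.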
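The convert-then-replace step can also be written as a small function, which covers the UTF-8 to ISO-8859-15 direction asked about in the question as well. This is only a sketch under some assumptions: the name convert_in_place is made up for illustration, mktemp is assumed to be available, and the output is written with a shell redirection rather than the -o flag.

```shell
#!/bin/sh
# convert_in_place FROM TO FILE
# Hypothetical helper: converts FILE from charset FROM to charset TO.
convert_in_place() {
    from=$1 to=$2 file=$3
    # Use a unique temporary file instead of a fixed name like "converted".
    tmp=$(mktemp) || return 1
    if iconv -f "$from" -t "$to" "$file" > "$tmp"; then
        # Copy the bytes back so the original file keeps its permissions.
        cat "$tmp" > "$file"
        rm -f "$tmp"
    else
        # Conversion failed: leave the original file untouched.
        rm -f "$tmp"
        return 1
    fi
}
```

For example, `convert_in_place UTF-8 ISO-8859-15 notes.txt` converts one way and `convert_in_place ISO-8859-15 UTF-8 notes.txt` converts back, where notes.txt is a placeholder filename.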

Here, file -bi means:
- -b, --brief Do not prepend filenames to output lines (brief mode).
- -i, --mime Causes the file command to output mime type strings rather than the more traditional human readable ones. Thus it may say for example text/plain; charset=us-ascii rather than ASCII text.

The sed command cuts this down to only us-ascii, which is the form iconv requires.
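The charset extraction can be tried in isolation. Below is a small sketch that feeds a sample line in the format quoted above through the same sed expression; the function name extract_charset is hypothetical and only wraps the sed call.

```shell
#!/bin/sh
# extract_charset: reads a `file -bi` style line on stdin and prints
# only the charset name (hypothetical helper, for illustration).
extract_charset() {
    sed -e 's/.*[ ]charset=//'
}

# Sample input; on a real file, pipe in the output of: file -bi somefile.txt
printf '%s\n' 'text/plain; charset=us-ascii' | extract_charset
```

For files it cannot classify, file may report values such as binary or unknown-8bit, which iconv is unlikely to accept as a source encoding, so it can be worth checking the detected value before converting.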
The find command is very useful for this kind of file management automation.