What is the fastest, easiest tool or method to convert text files between character sets?
Specifically, I need to convert from UTF-8 to ISO-8859-15 and vice versa.
The following command automatically detects the character encoding of every matching text file and converts each file to UTF-8:
$ find . -type f -iname '*.txt' -exec sh -c 'iconv -f $(file -bi "$1" | sed -e "s/.*[ ]charset=//") -t utf-8 -o converted "$1" && mv converted "$1"' -- {} \;
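To perform these steps, a child shell (sh) is started by -exec. It runs a one-liner given with the -c flag and receives the filename as the positional argument "$1" via -- {} (the -- takes the place of $0, so {} becomes "$1"). In between, the UTF-8 output is temporarily written to a file named converted, which then replaces the original. Note that the pattern '*.txt' must be quoted; otherwise the calling shell may expand it before find sees it.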
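The same steps can be spelled out as a small shell function, which is easier to read and also accepts a target encoding, so it covers the UTF-8 to ISO-8859-15 direction from the question. This is only a sketch: the name convert_in_place and the file name in the usage comment are hypothetical. It uses mktemp instead of the fixed name converted, and it relies on file -bi reporting a charset name that iconv accepts.

```shell
# Convert one file in place to a target encoding (default: utf-8).
# convert_in_place is a hypothetical helper name used for illustration.
convert_in_place() {
    src=$1
    target=${2:-utf-8}
    # Ask file(1) for the MIME description and keep only the charset name.
    charset=$(file -bi "$src" | sed -e 's/.*[ ]charset=//')
    # Write to a unique temporary file so that a failed conversion
    # never clobbers the original.
    tmp=$(mktemp) || return 1
    if iconv -f "$charset" -t "$target" "$src" > "$tmp"; then
        # Overwrite the contents rather than renaming, which keeps the
        # original file's permissions.
        cat "$tmp" > "$src" && rm -f "$tmp"
    else
        rm -f "$tmp"
        return 1
    fi
}

# Usage (notes.txt is a placeholder file name):
#   convert_in_place notes.txt               # detected charset -> utf-8
#   convert_in_place notes.txt ISO-8859-15   # detected charset -> ISO-8859-15
```

If the file contains characters that the target encoding cannot represent, iconv exits with an error and the original file is left untouched.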
Here, file -bi means:
-b, --brief
Do not prepend filenames to output lines (brief mode).
-i, --mime
Causes the file command to output mime type strings rather than the more traditional human readable ones. Thus it may say for example text/plain; charset=us-ascii rather than ASCII text.
The sed command cuts this to only us-ascii, as is required by iconv.
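To see the sed step in isolation, the substitution can be applied to a sample line in the format quoted above. The variable names mime and charset are only illustrative.

```shell
# Sample line in the format that file -bi prints for a plain ASCII file.
mime='text/plain; charset=us-ascii'

# Delete everything up to and including " charset=", leaving only the name.
charset=$(printf '%s\n' "$mime" | sed -e 's/.*[ ]charset=//')

printf '%s\n' "$charset"   # prints: us-ascii
```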
The find command is very useful for this kind of file management automation.
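As one illustration of that, find can also be used for a dry run that only reports the detected charset of each file before anything is converted. This is a sketch under the same assumptions as the command above (a file that supports -bi, and a find that supports -iname); the function name list_charsets is hypothetical.

```shell
# Print "path: charset" for every .txt file below a directory (default: .).
# list_charsets is a hypothetical helper name used for illustration.
list_charsets() {
    # "{} +" hands many filenames to a single sh, which loops over them.
    find "${1:-.}" -type f -iname '*.txt' -exec sh -c '
        for f do
            cs=$(file -bi "$f" | sed -e "s/.*[ ]charset=//")
            printf "%s: %s\n" "$f" "$cs"
        done' sh {} +
}
```

Running list_charsets first makes it easy to spot files that file reports as binary or unknown-8bit, which iconv cannot convert.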