Question
I need to remove all multibyte characters from a file. I don't know what they are, so I need to cover the whole range.
I can find them using grep like so: grep -P "[\x80-\xFF]" 'myfile'
I'm trying to do a similar thing with sed, but delete them instead.
Cheers
Answer 1:
Give this a try:
LANG=C sed 's/[\x80-\xFF]//g' filename
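The `\x80-\xFF` range in that command relies on GNU sed's `\xHH` escapes, and `LANG=C` has no effect if `LC_ALL` is already set in the environment. Here is a minimal sketch of a more portable variant, assuming the goal is simply to drop every byte from 0x80 to 0xFF. It swaps sed for `tr`, and `strip_high_bytes` is a hypothetical helper name, not something from the answer above.

```shell
# strip_high_bytes: hypothetical helper, reads stdin and writes stdout.
# LC_ALL=C makes tr work on single bytes; \200-\377 is octal for 0x80-0xFF.
strip_high_bytes() {
    LC_ALL=C tr -d '\200-\377'
}

# Usage: "café" encoded as UTF-8 (é is the byte pair 0xC3 0xA9, octal 303 251).
printf 'caf\303\251 au lait\n' | strip_high_bytes
# prints: caf au lait
```

To apply it to a file, something like `strip_high_bytes < filename > filename.clean` should work; writing to a separate output file avoids truncating the input while it is still being read.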
Answer 2:
You can use iconv to convert from one encoding to another.
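A rough sketch of what that could look like, assuming the input is UTF-8 and the target is plain ASCII (the question does not state the encoding, so both are assumptions). The `-c` option asks iconv to discard characters that cannot be represented in the target encoding; `to_ascii` is a hypothetical helper name.

```shell
# to_ascii: hypothetical helper, reads stdin and writes stdout.
# Assumes UTF-8 input; -c drops characters that have no ASCII equivalent.
to_ascii() {
    iconv -c -f UTF-8 -t ASCII
}

# Usage: some iconv builds exit non-zero when they drop characters,
# so the exit status is deliberately ignored here.
printf 'caf\303\251 au lait\n' | to_ascii || true
```

Unlike the byte-range approach, this works on whole characters, so it depends on naming the source encoding correctly.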
Source: https://stackoverflow.com/questions/3521106/removing-multibyte-characters-from-a-file-using-sed