Removing multibyte characters from a file using sed

帅比萌擦擦* submitted on 2020-12-05 07:46:11

Question


I need to remove all multibyte characters from a file. I don't know what they are, so I need to cover the whole range.

I can find them using grep like so: grep -P "[\x80-\xFF]" 'myfile'

I'm trying to do a similar thing with sed, but delete them instead.

Cheers


Answer 1:


Give this a try:

LANG=C sed 's/[\x80-\xFF]//g' filename
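A minimal sketch of how the command above could be tried on a throwaway file. It assumes GNU sed (the `\x80` style escapes are a GNU extension) and uses `LC_ALL=C` instead of `LANG=C`, because `LC_ALL` takes precedence if it is already set in the environment. The function name `strip_high_bytes` and the sample text are made up for illustration.

```shell
# strip_high_bytes is a hypothetical helper name, not part of the original answer.
strip_high_bytes() {
    # In the C locale sed works on single bytes, so the range 0x80-0xFF
    # covers every byte of every UTF-8 multibyte sequence.
    LC_ALL=C sed 's/[\x80-\xFF]//g' "$1"
}

# Build a small sample containing two 2-byte UTF-8 characters
# (written as octal escapes so the script itself stays pure ASCII).
sample=$(mktemp)
printf 'caf\303\251 na\303\257ve\n' > "$sample"

sed_cleaned=$(strip_high_bytes "$sample")
printf '%s\n' "$sed_cleaned"

rm -f "$sample"
```

To change a file in place, GNU sed's `-i` option can be added to the same command; keeping a backup first is probably wise, since the deleted bytes cannot be recovered.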



Answer 2:


You can use iconv to convert from one encoding to another.
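One way this suggestion might look in practice, as a sketch only: it assumes the input is UTF-8 and that plain ASCII is the desired result, neither of which the answer states. The `-c` option tells iconv to omit characters that cannot be represented in the target encoding. Some iconv builds exit with a non-zero status when characters were omitted, so the example does not treat that as a failure.

```shell
# Sample input with two 2-byte UTF-8 characters (octal escapes keep this script ASCII).
sample=$(mktemp)
printf 'caf\303\251 na\303\257ve\n' > "$sample"

# -f/-t name the source and target encodings (UTF-8 and ASCII are assumptions here);
# -c omits anything that has no ASCII equivalent.
# "|| true" ignores the non-zero exit status some builds return after omitting characters.
iconv_cleaned=$(iconv -c -f UTF-8 -t ASCII "$sample" || true)
printf '%s\n' "$iconv_cleaned"

rm -f "$sample"
```

Unlike the byte-range approach, this depends on knowing the real source encoding; if the file is not valid UTF-8, the result may differ.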



Source: https://stackoverflow.com/questions/3521106/removing-multibyte-characters-from-a-file-using-sed
