Trying to delete non-ASCII characters only [duplicate]

主宰稳场 submitted on 2020-01-21 03:11:06

Question


I am trying to manipulate a text file and remove non-ASCII characters from the text. I don't want to remove the line. I only want to remove the offending characters. I am trying to get the following expression to work:

sed '/[\x80-\xFF]/d'


Answer 1:


The suggested solutions may fail with specific versions of sed, e.g. GNU sed 4.2.1.

Using tr:

tr -cd '[:print:]' < yourfile.txt

This will remove any characters not in [\x20-\x7e].

If you want to keep e.g. line feeds, just add \n:

tr -cd '[:print:]\n' < yourfile.txt
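
As a quick illustration, here is a minimal sketch run on a made-up sample string. The LC_ALL=C prefix is an assumption added here, not part of the original command: it is meant to make tr treat [:print:] as the plain ASCII printable range regardless of the user's locale.

```shell
# Hypothetical input: "naïve" (ï is the UTF-8 byte pair 0xC3 0xAF),
# then a tab, then "text", then a newline.
# LC_ALL=C is an assumption: it forces the ASCII definition of [:print:].
printf 'na\303\257ve\ttext\n' | LC_ALL=C tr -cd '[:print:]\n'
# Both bytes of ï and the tab are deleted; the trailing newline is kept.
```

Note that the tab disappears as well, since it is a control character and not part of [:print:].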

If you really want to keep all ASCII characters (even the control codes):

tr -cd '[:print:][:cntrl:]' < yourfile.txt

This will remove any characters not in [\x00-\x7f].
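
The same filter can be wrapped in a small function, sketched below. The name strip_non_ascii and the sample string are hypothetical, and LC_ALL=C is again an assumption to pin the character classes to their ASCII meaning.

```shell
# strip_non_ascii is a hypothetical helper: it filters stdin to stdout,
# keeping printable ASCII and ASCII control codes.
strip_non_ascii() {
  LC_ALL=C tr -cd '[:print:][:cntrl:]'
}

# Made-up sample: "café" with é as the UTF-8 bytes 0xC3 0xA9, then a tab.
printf 'caf\303\251\tok\n' | strip_non_ascii
# The two bytes of é are deleted; the tab and the newline survive.
```

Since tr does not edit files in place, the result would normally be redirected to a new file, e.g. strip_non_ascii < yourfile.txt > cleaned.txt (cleaned.txt being a placeholder name).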



Source: https://stackoverflow.com/questions/15034944/trying-to-delete-non-ascii-characters-only
