I am trying to remove non-printable characters (e.g. ^@) from the records in my file. Since the volume of records in the file is too big, using cat is not an option.
Perhaps you could go with the complement of [:print:], which contains all printable characters. Newline is not in [:print:], so add \n to the set to keep the line breaks between records:
tr -cd '[:print:]\n' < file > newfile
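To see what this does on a small input, here is a minimal sketch. The sample data and the temporary file names are made up for illustration, and LC_ALL=C is my assumption to force byte-wise behaviour. The \n in the set is there so that the line breaks between records survive; with '[:print:]' alone, tr would delete them too.

```shell
# Hypothetical sample: two records, the first one containing a NUL byte (shown as ^@ in editors).
tmpdir=$(mktemp -d)
printf 'ab\000c\nline2\n' > "$tmpdir/sample.txt"

# Delete every byte that is neither printable nor a newline.
LC_ALL=C tr -cd '[:print:]\n' < "$tmpdir/sample.txt" > "$tmpdir/cleaned.txt"

# Inspect the result byte by byte; the NUL should be gone, the newlines kept.
od -c "$tmpdir/cleaned.txt"
```

Because tr reads the file as a stream, this should scale to large files without loading them into memory.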
If your version of tr doesn't support multi-byte characters (it seems that many don't), this works for me with GNU sed (with UTF-8 locale settings):
sed 's/[^[:print:]]//g' file
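A quick way to check the sed variant is to feed it a line with a few control characters. This is only a sketch with made-up input. Note that sed works line by line, so the newlines are kept, but tabs are not in [:print:] and get removed as well.

```shell
# Hypothetical input containing a tab, a \001 byte and a bell (\007).
cleaned=$(printf 'tab\there\001 and bell\007\n' | sed 's/[^[:print:]]//g')

# Show what is left after the substitution.
printf '%s\n' "$cleaned"
```

If the tabs matter, the bracket expression would need to allow them explicitly.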
strings -1 file... > outputfile
seems to work
Remove all control characters first:
tr -dc '\007-\011\012-\015\040-\376' < file > newfile
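As a sanity check, the ranges in that command can be tried on a short sample. This is a sketch on invented data, with LC_ALL=C assumed so that tr treats the input as plain bytes. The set keeps \007 through \015 (bell, backspace, tab, newline, vertical tab, form feed, carriage return) and \040 through \376, so NUL and \001 are dropped while tabs and newlines stay.

```shell
# Hypothetical sample with a NUL byte, a tab and a \001 byte.
out=$(printf 'a\000b\tc\001d\n' | LC_ALL=C tr -dc '\007-\011\012-\015\040-\376')

# Dump the surviving bytes.
printf '%s\n' "$out" | od -c
```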
Then try your string:
sed -i 's/[^@a-zA-Z 0-9`~!@#$%^&*()_+\[\]\\{}|;'\'':",.\/<>?]//g' newfile
I believe that what you see as ^@ is in fact a NUL byte (\0). The tr filter above will remove those as well.
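To confirm that the ^@ really is a NUL byte, one option is to count the NUL bytes before and after cleaning. The snippet below is a sketch on an invented sample file; the variable names are only for illustration.

```shell
# Hypothetical sample file with two NUL bytes.
tmpdir=$(mktemp -d)
printf 'ab\000c\n\000\n' > "$tmpdir/sample.txt"

# Keep only the NUL bytes and count them.
nul_before=$(tr -cd '\000' < "$tmpdir/sample.txt" | wc -c | tr -d ' ')

# Delete just the NUL bytes, then count again.
tr -d '\000' < "$tmpdir/sample.txt" > "$tmpdir/cleaned.txt"
nul_after=$(tr -cd '\000' < "$tmpdir/cleaned.txt" | wc -c | tr -d ' ')

echo "NUL bytes before: $nul_before, after: $nul_after"
```

If the first count is zero on the real file, the ^@ is something else and od -c on a few affected lines should show which byte it is.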