I want to remove Unicode characters in some range, e.g.:
echo "abcＡＢＣ123" | sed 's/[\uff21-\uff3b]//g'
I expect "abc123", but get:
Not sure why sed is not working, but you can use tr instead:
$ echo 'abcABC123' | tr -d 'A-Z'
abc123
From man tr:
tr - translate or delete characters
-d, --delete delete characters in SET1, do not translate
Unicode support in sed is not well defined. You may be better off using command-line perl:
echo "abcＡＢＣ123" | perl -CS -pe 's/[\x{FF21}-\x{FF3B}]+//g'
abc123
It is important to use the -CS flag here: it tells perl to treat stdin, stdout, and stderr as UTF-8, so the input is decoded into characters before the range is matched.