How to use sed to delete Unicode characters in a given range?

孤独总比滥情好 2021-01-05 22:10

I want to remove Unicode characters that fall in a certain range, e.g.:

echo "abcＡＢＣ123" | sed 's/[\uff21-\uff3b]//g'

I expect "abc123", but get:

2 Answers
  • 2021-01-05 22:41

    I'm not sure why sed is not working, but you can use tr instead:

    $ echo 'abcABC123' | tr -d 'A-Z'
    abc123
    


    From man tr

    tr - translate or delete characters

    -d, --delete delete characters in SET1, do not translate

  • 2021-01-05 22:58

    Unicode support in sed is not well defined. You may be better off using Perl on the command line:

    echo "abcＡＢＣ123" | perl -CS -pe 's/[\x{FF21}-\x{FF3B}]+//g'
    
    abc123
    

    The -CS flag is important here: it makes perl treat STDIN, STDOUT, and STDERR as UTF-8, so the input is decoded into characters before the range is matched and the output is encoded correctly.
