Question
Can cut be used in bash with the ¬ delimiter?
This question is an extension of the topic covered here. One interpretation of the goal in that link is to use a delimiter that cannot be found (or is very rarely found) in human text. Say we choose the 'Not Sign' (¬) as a delimiter. My question is about using cut to pull specific columns from a file with that delimiter.
For example, say that we create a file with the ¬ delimiter. The file prac.txt might look like:
$ cat prac.txt
"Billy"¬"Car"¬"Red"¬"Garage"¬"3"
"Rob"¬"Truck"¬"Blue"¬"Street"¬"14"
The following process produces an error:
$ cut -d'¬' -f1 prac.txt
cut: the delimiter must be a single character
Try `cut --help' for more information.
The correct output would be:
"Billy"
"Rob"
Possibly useful info from Python:
>>> import unicodedata
>>> unicodedata.lookup('Not sign')
u'\xac'
Possibly useful character conversion link.
My guess is that the -d flag expects some representation of '¬' that I have not tried yet, or else it only works with single ASCII characters. Thanks in advance for any help.
Answer 1:
In UTF-8, the "not sign" is encoded as two bytes, c2 ac. cut treats the delimiter as a single byte, so it rejects the two-byte sequence, which is arguably a bug. See this discussion on unix.stackexchange.
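One possible workaround, not part of the original answer, is to let awk do the splitting, since awk accepts a field separator longer than one byte. The sketch below assumes the script and the file are both UTF-8 encoded and recreates prac.txt as a scratch copy of the sample from the question; the variable name first_col is only for illustration.

```shell
# Recreate the sample file from the question (scratch copy, assumed UTF-8).
printf '%s\n' \
  '"Billy"¬"Car"¬"Red"¬"Garage"¬"3"' \
  '"Rob"¬"Truck"¬"Blue"¬"Street"¬"14"' > prac.txt

# Show how the delimiter is stored on disk; in UTF-8 it is more than one byte.
printf '¬' | od -An -tx1

# Use awk in place of cut: -F sets the field separator, $1 is the first column.
first_col=$(awk -F'¬' '{ print $1 }' prac.txt)
printf '%s\n' "$first_col"
```

Changing $1 to another field number selects a different column, much as -f does for cut.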
Source: https://stackoverflow.com/questions/19821639/using-cut-in-bash-on-a-file-with-a-unique-deliminter