Using cut in bash on a file with a unique delimiter

Submitted by 十年热恋 on 2020-01-01 12:32:07

Question


Can cut be used in bash with the ¬ delimiter?

This question is an extension of the topic covered here. One interpretation of the goal in that link is to use a delimiter that cannot be found (or is very rarely found) in human text. Say we choose the 'Not Sign' (¬) as the delimiter. My question is about using cut to pull specific columns from a file with that delimiter.

For example, say that we create a file with the ¬ delimiter. The file prac.txt might look like:

$ cat prac.txt
"Billy"¬"Car"¬"Red"¬"Garage"¬"3"
"Rob"¬"Truck"¬"Blue"¬"Street"¬"14"

The following process produces an error:

$ cut -d'¬' -f1 prac.txt
cut: the delimiter must be a single character
Try `cut --help' for more information.

The correct output would be:

"Billy"
"Rob"

Possibly useful info from Python:

>>> import unicodedata
>>> unicodedata.lookup('Not sign')
u'\xac'

Possibly useful character conversion link.

My guess is that the -d flag needs some representation of '¬' that I have not tried yet, or else it only works with single ASCII characters. Thanks in advance for any help.


Answer 1:


In UTF-8, the "not sign" is encoded as two bytes, c2 ac, and cut does not handle multibyte delimiters, which is arguably a bug. See this discussion on unix.stackexchange.
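
Because the delimiter occupies two bytes in UTF-8, one possible workaround is to split the fields with awk rather than cut. The sketch below is not taken from the linked discussion; it assumes the file is UTF-8 encoded and that an awk which accepts a multi-byte field separator is available (gawk and mawk should both qualify). The helper name cut_field is hypothetical, introduced only for illustration.

```shell
#!/bin/sh
# The NOT SIGN (U+00AC) built from its two UTF-8 bytes, c2 ac (octal 302 254).
delim=$(printf '\302\254')

# Show the raw bytes of the delimiter; od prints them in hex as c2 ac.
printf '%s' "$delim" | od -An -tx1

# cut_field N [FILE...] is a hypothetical helper: print field N of each line,
# splitting on the delimiter with awk instead of cut.
# With no FILE argument, awk reads standard input.
cut_field() {
    n=$1
    shift
    awk -F"$delim" -v n="$n" '{ print $n }' "$@"
}

# Two sample rows in the shape of prac.txt (this script must be saved as UTF-8).
printf '%s\n' \
    '"Billy"¬"Car"¬"Red"¬"Garage"¬"3"' \
    '"Rob"¬"Truck"¬"Blue"¬"Street"¬"14"' |
    cut_field 1
# prints:
# "Billy"
# "Rob"
```

With a file on disk, the equivalent call would be cut_field 1 prac.txt. Building the delimiter from its bytes with printf keeps the helper itself independent of how the terminal or editor encodes '¬'.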



Source: https://stackoverflow.com/questions/19821639/using-cut-in-bash-on-a-file-with-a-unique-deliminter
