awk ignore delimiter inside single quote within a parenthesis

大兔子大兔子 submitted on 2019-12-13 03:04:05

Question


I have a set of data inside the csv as below:

 Given Data:
 (12,'hello','this girl,is lovely(adorable \r\n actually)',goodbye),
 (13,'hello','this fruit,is super tasty (sweet actually)',goodbye)

I want to print the given data as 2 rows, each running from the opening ( to the closing ), while ignoring the delimiter , and any ( ) that appear inside a ' ' quoted field.

How can I do this using awk or sed on Linux?

Expected result as below:

 Expected Result: 
 row 1 = 12,'hello','this girl,is lovely(adorable actually)',goodbye
 row 2 = 13,'hello','this fruit,is super tasty (sweet actually)',goodbye

UPDATE: I just noticed that there is a comma between the 2 rows. So how can I separate the data into 2 rows using the , that comes after ) and before (?


Answer 1:


You can use the following awk command to achieve your goal (in-place editing needs GNU awk 4.1 or later):

gawk -i inplace -v INPLACE_SUFFIX=.bak '{str=substr($0,2,length($0)-2); gsub("\\\\r ?|\\\\n ?","",str); print "row "NR" = "str;}' file.in

Tested on your input.

explanations:

  • -i inplace -v INPLACE_SUFFIX=.bak edits file.in in place and keeps a backup copy as file.in.bak; awk has no sed-style -i.bak option, and you can drop both flags to print to standard output instead
  • {str=substr($0,2,length($0)-2); gsub("\\\\r ?|\\\\n ?","",str); print "row "NR" = "str;} first removes the first and last character of the line (the enclosing parentheses), then deletes the literal \r and \n sequences together with one optional trailing space, and finally prints the result in the format you want
  • if the file has a header line, add the condition NR>1 before the {...} -> 'NR>1{str=substr($0,2,length($0)-2); gsub("\\\\r ?|\\\\n ?","",str); print "row "NR" = "str;}'
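As a minimal sketch of the substr/gsub step described above, the snippet below runs it on a single record from the question and prints to standard output rather than editing a file. The function name strip_record is hypothetical, and the sample line assumes the record has already been isolated (no trailing comma). The regex literal /\\r ?|\\n ?/ matches the same text as the quoted string form used in the answer.

```shell
# strip_record (hypothetical helper name): reads records on stdin,
# one per line, and prints each one cleaned up.
strip_record() {
  awk '{
    str = substr($0, 2, length($0) - 2)   # drop the first and the last character
    gsub(/\\r ?|\\n ?/, "", str)          # delete literal \r and \n, plus one optional space
    print str
  }'
}

# Quoted heredoc delimiter, so the backslashes reach awk unchanged.
cat <<'EOF' | strip_record
(12,'hello','this girl,is lovely(adorable \r\n actually)',goodbye)
EOF
```

Commas and parentheses inside the quoted field survive because the command never splits on them; it only trims the two outer characters.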

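Following the change in your requirements, I adapted the awk command to treat the , at the end of a line as part of the record separator (row separator):

gawk -i inplace -v INPLACE_SUFFIX=.bak 'BEGIN{RS=",\n|\n"}{str=substr($0,2,length($0)-2); gsub("\\\\r ?|\\\\n ?","",str); print "row "NR" = "str;}' file.in

where BEGIN{RS=",\n|\n"} defines the record separator: either a comma followed by a newline, or a newline alone. Note that awk has no sed-style -i.bak option; with GNU awk 4.1 or later, -i inplace -v INPLACE_SUFFIX=.bak edits the file in place and keeps a .bak backup.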
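To try the record-separator approach end to end, here is a hedged sketch that recreates the two rows from the question in a scratch file and prints the result to standard output instead of editing in place. The function name split_rows and the scratch file are illustrative assumptions, and a multi-character regex RS assumes an awk that supports it (gawk and mawk do; some older awks use only the first character of RS).

```shell
# Recreate the sample data from the question in a scratch file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
(12,'hello','this girl,is lovely(adorable \r\n actually)',goodbye),
(13,'hello','this fruit,is super tasty (sweet actually)',goodbye)
EOF

# split_rows (hypothetical helper name): prints one "row N = ..." line per record.
split_rows() {
  awk 'BEGIN { RS = ",\n|\n" }            # a record ends at comma+newline or at a bare newline
  {
    str = substr($0, 2, length($0) - 2)   # drop the enclosing parentheses
    gsub(/\\r ?|\\n ?/, "", str)          # delete literal \r and \n, plus one optional space
    print "row " NR " = " str
  }' "$1"
}

split_rows "$sample"
rm -f "$sample"
```

Because the separator only matches a comma that is immediately followed by a newline, the commas inside each row are left alone.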



Source: https://stackoverflow.com/questions/48372907/awk-ignore-delimiter-inside-single-quote-within-a-parenthesis
