Using awk to extract a column containing spaces

Submitted by 不羁岁月 on 2019-12-12 03:28:18

Question


I'm looking for a way to extract the filename column from the output below.

    2016-02-03 08:22:33     610540 vendor_20160202_67536242.WAV
    2016-02-03 08:19:25     530916 vendor_20160202_67536349.WAV
    2016-02-03 08:17:10    2767824 vendor_20160201_67369072 - cb.mp3
    2016-02-03 08:17:06     368928 vendor_20160201_67369072.mp3

One of the files has spaces in its name, which is causing issues with my current command:

awk '{print $4}'

How would I treat a column with spaces as a single column?


Answer 1:


awk to the rescue!

$ awk '{for(i=4;i<NF;i++) printf "%s", $i OFS; 
        printf "%s", $NF ORS}' file

vendor_20160202_67536242.WAV
vendor_20160202_67536349.WAV
vendor_20160201_67369072 - cb.mp3
vendor_20160201_67369072.mp3

or alternatively,

$ awk '{for(i=5;i<=NF;i++) $4=$4 OFS $i; print $4}' file   
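One caveat applies to both loops: awk splits the line on runs of whitespace and the loop rejoins the pieces with a single OFS, so consecutive spaces inside a name come back collapsed. A minimal sketch of the effect, using a made-up input line (the `join_from_4` wrapper name is hypothetical, not part of the answer):

```shell
# Hypothetical wrapper around the first loop shown in this answer.
join_from_4() {
  awk '{ for (i = 4; i < NF; i++) printf "%s", $i OFS
         printf "%s", $NF ORS }' "$@"
}

# Made-up sample line: the name has TWO spaces between "a" and "b.mp3".
printf '%s\n' '2016-02-03 08:17:10 2767824 a  b.mp3' | join_from_4
# prints "a b.mp3" (one space): the original spacing is lost
```

For the sample data in the question this makes no difference, since no name there contains a run of spaces.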

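If your file format is fixed-width, using that structure is perhaps a better idea (the offset 36 assumes the four leading spaces shown in the question are really part of each line):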

$ cut -c36- file

vendor_20160202_67536242.WAV
vendor_20160202_67536349.WAV
vendor_20160201_67369072 - cb.mp3
vendor_20160201_67369072.mp3
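The starting column can also be measured from the data instead of counted by hand. A rough sketch, assuming the first word of the name does not also occur earlier on the same line (`name_col` is a hypothetical helper name):

```shell
name_col() {
  # index() gives the 1-based position of the first occurrence of $4 in the line;
  # sort -u reduces the per-line results to the distinct values.
  awk '{ print index($0, $4) }' "$@" | sort -u
}

# Two of the sample lines, here WITHOUT leading indentation (an assumption).
printf '%s\n' \
  '2016-02-03 08:22:33     610540 vendor_20160202_67536242.WAV' \
  '2016-02-03 08:17:10    2767824 vendor_20160201_67369072 - cb.mp3' | name_col
# prints 32 for these unindented lines; with four leading spaces it would be 36
```

A single number in the output suggests the column really is fixed and can be passed to `cut -c`. Several numbers mean the layout is not fixed-width and a field-based approach is safer.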



Answer 2:


You could just delete any leading whitespace plus the first 3 nonspace-then-space blocks:

$ awk '{sub(/^[[:space:]]*([^[:space:]]+[[:space:]]+){3}/,"")}1' file
vendor_20160202_67536242.WAV
vendor_20160202_67536349.WAV
vendor_20160201_67369072 - cb.mp3
vendor_20160201_67369072.mp3
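The `{3}` interval expression and the `[[:space:]]` class are not accepted by every awk; some older implementations lack them. A sketch of the same idea that avoids both, assuming the separators are only spaces or tabs (`strip3` is a hypothetical name):

```shell
strip3() {
  # Remove one leading "blanks, non-blanks, blanks" block per pass, three times,
  # then print whatever is left of the line.
  awk '{ for (i = 1; i <= 3; i++) sub(/^[ \t]*[^ \t]+[ \t]+/, ""); print }' "$@"
}

printf '%s\n' '2016-02-03 08:17:10    2767824 vendor_20160201_67369072 - cb.mp3' | strip3
# prints "vendor_20160201_67369072 - cb.mp3"
```

Because the rest of the line is never re-split, any spacing inside the name is left exactly as it was.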

But it looks like you have fixed-width fields, so to print the last "field" you could just do:

$ awk '{print substr($0,32)}' file
vendor_20160202_67536242.WAV
vendor_20160202_67536349.WAV
vendor_20160201_67369072 - cb.mp3
vendor_20160201_67369072.mp3

But in general, for fixed-width input, use GNU awk's FIELDWIDTHS:

$ gawk -v FIELDWIDTHS='10 9 11 9999' '
     {for (i=1;i<=NF;i++) { gsub(/^ +| +$/,"",$i); print NR, NF, i, "<" $i ">" } print "---"}
  ' file
1 4 1 <2016-02-03>
1 4 2 <08:22:33>
1 4 3 <610540>
1 4 4 <vendor_20160202_67536242.WAV>
---
2 4 1 <2016-02-03>
2 4 2 <08:19:25>
2 4 3 <530916>
2 4 4 <vendor_20160202_67536349.WAV>
---
3 4 1 <2016-02-03>
3 4 2 <08:17:10>
3 4 3 <2767824>
3 4 4 <vendor_20160201_67369072 - cb.mp3>
---
4 4 1 <2016-02-03>
4 4 2 <08:17:06>
4 4 3 <368928>
4 4 4 <vendor_20160201_67369072.mp3>
---
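FIELDWIDTHS is a gawk extension. Where gawk is not available, the same slicing can be approximated with `substr()`. A minimal sketch, assuming the widths 10, 9 and 11 used above and input without leading indentation (`split_fixed`, `trim` and the variable names are hypothetical):

```shell
split_fixed() {
  awk '
    # Strip leading and trailing spaces from one slice.
    function trim(s) { gsub(/^ +| +$/, "", s); return s }
    {
      date_f = trim(substr($0, 1, 10))    # columns 1-10
      time_f = trim(substr($0, 11, 9))    # columns 11-19
      size_f = trim(substr($0, 20, 11))   # columns 20-30
      name_f = trim(substr($0, 31))       # column 31 to end of line
      print "<" date_f "> <" time_f "> <" size_f "> <" name_f ">"
    }' "$@"
}

printf '%s\n' '2016-02-03 08:17:10    2767824 vendor_20160201_67369072 - cb.mp3' | split_fixed
# prints "<2016-02-03> <08:17:10> <2767824> <vendor_20160201_67369072 - cb.mp3>"
```

The trade-off is that the column arithmetic is spelled out by hand here, whereas FIELDWIDTHS keeps it in one declarative list.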


Source: https://stackoverflow.com/questions/35178883/using-awk-to-extract-a-column-containing-spaces
