I have a requirement to select the 7th column from a tab-delimited file, e.g.:
cat filename | awk '{print $7}'
The issue is that the data in th
If the data is unambiguously tab-separated, then cut will cut on tabs, not spaces:
cut -f7 filename
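To see the difference, here is a small sketch using a made-up sample line (the field values are hypothetical, not taken from your file) in which the 7th field contains spaces:

```shell
# Hypothetical sample line: 8 tab-separated fields; the 7th contains spaces.
line=$(printf 'a\tb\tc\td\te\tf\tOut Global Doc Mark\th')

# awk's default separator is any run of blanks, so $7 is only the first word.
awk_default=$(printf '%s\n' "$line" | awk '{print $7}')

# cut splits on tabs only, so the whole field survives.
cut_tabs=$(printf '%s\n' "$line" | cut -f7)

echo "awk default: $awk_default"
echo "cut -f7:     $cut_tabs"
```

The awk call with the default separator returns only the first word of the field, while cut returns the field in full.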
You can certainly do that with awk, too:
awk -F'\t' '{ print $7 }' filename
Judging by the format of your input file, you can get away with delimiting on - instead of spaces:
awk 'BEGIN{FS="-"} {print $2}' filename
FS stands for Field Separator; just think of it as the delimiter for input.
With - as the delimiter, what was your 7th field now becomes the 2nd field.
There is no need to pipe from cat; pass filename as an argument to awk instead.
Alternatively, if your data fields are separated by tabs, you can do it more explicitly as follows:
awk 'BEGIN{FS="\t"} {print $7}' filename
And this will resolve the issue, since Out Global Doc Mark looks to be separated by spaces.
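As a rough check, the sketch below builds a throwaway two-row file (the contents are invented for illustration) and confirms that setting FS in a BEGIN block and passing -F on the command line select the same 7th field:

```shell
# Hypothetical input file: two rows, 8 tab-separated fields each.
tmp=$(mktemp)
printf '1\t2\t3\t4\t5\t6\tOut Global Doc Mark\t8\n' >  "$tmp"
printf 'a\tb\tc\td\te\tf\tsecond row value\th\n'    >> "$tmp"

# Setting FS in a BEGIN block and passing -F are two ways to say the same thing.
with_begin=$(awk 'BEGIN{FS="\t"} {print $7}' "$tmp")
with_flag=$(awk -F'\t' '{print $7}' "$tmp")

printf '%s\n' "$with_begin"
rm -f "$tmp"
```

Both forms keep the spaces inside the field, because only the tab is treated as a separator.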
This might work for you (GNU sed):
sed -r 's/(([^\t]*)\t?){7}.*/\2/' file
This substitute command matches the entire line and replaces it with the 7th field (the 7th run of non-tab characters). In sed, when a group (...) is repeated, a back-reference to it returns the last thing it matched, and back-references can be used in the replacement (right-hand) side of the substitution. In this case the first back-reference, \1, would return both the non-tab characters and the trailing tab character (if present; N.B. the ? meta-character matches either one or none of the preceding pattern), so the second back-reference, \2, which holds only the non-tab characters, is used instead. The .* just swallows up whatever is left on the line, if anything.
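A quick way to try this out, assuming GNU sed (the \t escape inside the bracket expression and the -r flag are GNU features) and a made-up sample line:

```shell
# Hypothetical sample line: 8 tab-separated fields; the 7th contains a space.
line=$(printf 'a\tb\tc\td\te\tf\tseven words\th')

# The outer group repeats 7 times; \2 keeps the non-tab part of the last repeat.
seventh=$(printf '%s\n' "$line" | sed -r 's/(([^\t]*)\t?){7}.*/\2/')

echo "$seventh"
```

The inner group is what makes this work: using \1 instead would also carry along the tab that follows the 7th field.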
If fields are separated by tabs and your concern is that some fields contain spaces, there is no problem here, just:
cut -f 7
(cut defaults to tab delimited fields.)