I have a file with many lines. In each line there are many columns (fields) separated by a blank " ", and the number of columns differs from line to line. I want to remove the first two columns. How can I do that?
You can do it with cut:
cut -d " " -f 3- input_filename > output_filename
Explanation:
cut: invoke the cut command
-d " ": use a single space as the delimiter (cut uses TAB by default)
-f: specify fields to keep
3-: all the fields starting with field 3
input_filename: use this file as the input
> output_filename: write the output to this file.
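As a quick check of the cut command above, here is a minimal sketch on made-up sample data. The function name drop_first_two_cut and the sample lines are illustrative assumptions, not part of the original answer. It also shows that every single space counts as a delimiter, so a doubled space yields an empty field.

```shell
# Hypothetical helper: drop the first two space-separated fields from stdin.
drop_first_two_cut() {
    cut -d ' ' -f 3-
}

# Lines with different numbers of columns.
printf 'a b c d\n1 2 3\n' | drop_first_two_cut
# prints:
# c d
# 3

# Two spaces between "a" and "b" create an empty second field,
# so "b" is counted as field 3 and is kept.
printf 'a  b c\n' | drop_first_two_cut
# prints:
# b c
```

If the input may contain runs of blanks, cut alone is probably not the right tool.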
Alternatively, you can do it with awk:
awk '{$1=""; $2=""; sub(/^ +/, ""); print}' input_filename > output_filename
Explanation:
awk: invoke the awk command
$1=""; $2="";: set fields 1 and 2 to the empty string
sub(...);: clean up the output, because the emptied fields 1 and 2 will still be delimited by " "
print: print the modified line
input_filename > output_filename: same as above.
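To see why the clean-up step is needed, the following sketch compares the output with and without stripping the leading separators. It assumes awk's default field separators; the function names and the sample input are invented for illustration.

```shell
# Hypothetical helpers for comparison; both read from stdin.

# Blank fields 1 and 2 only: awk rebuilds the line with the output
# separator (a space) after each emptied field, so two spaces remain in front.
blank_only() {
    awk '{$1=""; $2=""; print}'
}

# Same, then strip the leading spaces left behind.
blank_and_trim() {
    awk '{$1=""; $2=""; sub(/^ +/, ""); print}'
}

printf 'a b c d\n' | blank_only      # prints "  c d" (two leading spaces)
printf 'a b c d\n' | blank_and_trim  # prints "c d"
```

Because assigning to a field makes awk rebuild the line, runs of blanks between the remaining fields are also collapsed to single spaces.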
Here's one way to do it with Awk that's relatively easy to understand:
awk '{print substr($0, index($0, $3))}'
This is a simple awk command with no pattern, so the action inside {} is run for every input line.
The action simply prints the substring starting at the position of the 3rd field.
$0: the whole input line
$3: the 3rd field
index(in, find): returns the position of find in string in
substr(string, start): returns the substring starting at index start
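One caveat worth checking before relying on this: index returns the first occurrence of the text, so if the third field's text also appears earlier in the line, the substring starts too early. The sketch below demonstrates that on invented input and shows one possible workaround using sub with a regular expression. The workaround and the function names are assumptions added for illustration, not part of the answer above.

```shell
# The approach from the answer, wrapped in a hypothetical helper.
by_index() {
    awk '{print substr($0, index($0, $3))}'
}

# Possible workaround (an assumption): delete the first two fields and the
# blanks/tabs around them, leaving the rest of the line untouched.
by_regex() {
    awk '{sub(/^[ \t]*[^ \t]+[ \t]+[^ \t]+[ \t]+/, ""); print}'
}

printf 'a b c d\n'   | by_index   # prints "c d"
printf 'abc b c d\n' | by_index   # prints "c b c d": "c" is first found inside "abc"
printf 'abc b c d\n' | by_regex   # prints "c d"
```

Note that by_regex leaves a line unchanged when it has only two fields and no trailing blank, since the expression requires a separator after the second field.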
If you want to use a different delimiter, such as comma, you can specify it with the -F option:
awk -F"," '{print substr($0, index($0, $3))}'
You can also apply this to a subset of the input lines by specifying a pattern before the action in {}. Only lines matching the pattern will have the action run.
awk 'pattern{print substr($0, index($0, $3))}'
Where pattern can be something such as:
/abcdef/: use a regular expression; operates on $0 by default.
$1 ~ /abcdef/: operate on a specific field.
$1 == "blabla": use string comparison
NR > 1: use the record/line number
NF > 0: use the number of fields/columns
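As an illustration of the pattern forms listed above, this sketch applies the same action only to selected lines. The sample data and the function names are made up for the example.

```shell
# Hypothetical helper: skip the first line (for example a header) with NR > 1.
skip_header() {
    awk 'NR > 1 {print substr($0, index($0, $3))}'
}

# Hypothetical helper: act only on lines whose first field equals "keep".
only_keep() {
    awk '$1 == "keep" {print substr($0, index($0, $3))}'
}

printf 'h1 h2 h3\nx y z w\n' | skip_header      # prints "z w"
printf 'keep 1 2 3\ndrop 4 5 6\n' | only_keep   # prints "2 3"
```

Lines that do not match the pattern produce no output at all, because the pattern has an explicit action attached.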
Thanks for posting the question. I'd also like to add the script that helped me.
awk '{ $1=""; print $0 }' file
awk '{$1=$2="";$0=$0;$1=$1}1'
Input
a b c d
Output
c d
You can use sed:
sed 's/^[^ ][^ ]* [^ ][^ ]* //'
This looks for lines starting with one-or-more non-blanks, a blank, another set of one-or-more non-blanks and another blank, and deletes the matched material, aka the first two fields. The [^ ][^ ]* is marginally shorter than the equivalent but more explicit [^ ]\{1,\} notation, and the second might run into issues with GNU sed (though if you use --posix as an option, even GNU sed can't screw it up). OTOH, if the character class to be repeated was more complex, the numbered notation wins for brevity. It is easy to extend this to handle 'blank or tab' as separator, or 'multiple blanks' or 'multiple blanks or tabs'. It could also be modified to handle optional leading blanks (or tabs) before the first field, etc.
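The extensions mentioned above could look something like the following sketch, which accepts optional leading blanks or tabs and one or more blanks or tabs between fields. It uses the POSIX [[:blank:]] character class; the function name and the sample input are assumptions for illustration.

```shell
# Hypothetical helper: remove the first two fields, where fields are
# separated by one or more blanks or tabs and may be preceded by blanks/tabs.
drop_first_two_sed() {
    sed 's/^[[:blank:]]*[^[:blank:]][^[:blank:]]*[[:blank:]][[:blank:]]*[^[:blank:]][^[:blank:]]*[[:blank:]][[:blank:]]*//'
}

# printf turns \t into a real tab before sed sees the line.
printf '  a \t b   c d\n' | drop_first_two_sed   # prints "c d"
```

The rest of the line, including any spacing between the remaining fields, is passed through unchanged.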
For awk and cut, see Sampson-Chen's answer. There are other ways to write the awk script, but they're not materially better than the answer given. Note that a plain -F" " is awk's default and still splits on runs of blanks and tabs, so you might need to set the field separator explicitly (for example -F'[ ]') if you do not want tabs treated as separators, or if every single blank should count as a separator. The POSIX standard cut does not support multiple separators between fields, and GNU cut does not appear to have an option for that either.
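Because cut treats every blank as its own separator, input with runs of blanks needs preprocessing. One common workaround, which is not part of the answer above, is to squeeze repeated blanks with tr -s first. The function name and sample data below are illustrative assumptions.

```shell
# Hypothetical helper: collapse runs of spaces, then cut from field 3 on.
# Note that tr -s also squeezes runs of spaces inside the remaining fields.
squeeze_then_cut() {
    tr -s ' ' | cut -d ' ' -f 3-
}

printf 'a  b   c d\n' | squeeze_then_cut   # prints "c d"
```

Leading blanks are squeezed to one space rather than removed, so a line that starts with blanks would still have an empty first field.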
You can also do it in pure shell:
while read -r junk1 junk2 residue
do printf '%s\n' "$residue"
done < in-file > out-file
It's pretty straightforward to do it with only the shell:
while read A B C; do
echo "$C"
done < oldfile >newfile
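Both shell loops rely on read assigning the first two words to the first two variables and everything left over to the last one. Here is a small sketch of that behavior, with invented names and input, using read -r so backslashes are kept literally.

```shell
# Hypothetical helper: print each stdin line without its first two words.
drop_first_two_sh() {
    while read -r first second rest; do
        # "rest" holds the remainder of the line, inner spacing included.
        printf '%s\n' "$rest"
    done
}

printf 'a b c  d\n' | drop_first_two_sh   # prints "c  d" (double space kept)
```

A line with two or fewer words leaves the last variable empty, so the loop prints an empty line for it.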
perl:
perl -lane 'print join(" ",@F[2..$#F])' File
awk:
awk '{$1=$2=""}1' File
This might work for you (GNU sed):
sed -r 's/^([^ ]+ ){2}//' file
or for columns separated by one or more white spaces:
sed -r 's/^(\S+\s+){2}//' file
Use kscript
kscript 'lines.split().select(-1,-2).print()' file
Using awk, and based on some of the other options here, a for loop makes this a bit more flexible; sometimes I want to delete the first 9 columns (of "ls -lrt" output, for example), so I change the 2 to a 9 and that's it:
awk '{ for(i=0;i++<2;){$i=""}; print $0 }' your_file.txt
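The number of columns can also be passed in rather than edited in the script. The sketch below is one way to do that with awk's -v option; the variable name n, the function name, and the trimming step are assumptions added for illustration.

```shell
# Hypothetical helper: drop the first N fields (N given as $1) from stdin.
drop_first_n() {
    awk -v n="$1" '{
        for (i = 1; i <= n; i++) $i = ""   # blank the first n fields
        sub(/^ +/, "")                     # strip the separators left behind
        print
    }'
}

printf 'a b c d e\n' | drop_first_n 2   # prints "c d e"
printf 'a b c d e\n' | drop_first_n 4   # prints "e"
```

As with the other field-assignment approaches, awk rebuilds the line, so the remaining fields come out separated by single spaces.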
Source: https://stackoverflow.com/questions/13446255/how-to-remove-the-first-two-columns-in-a-file-using-shell-awk-sed-whatever