Say I have a large file with many rows and many columns. I'd like to find out how many rows and columns I have using bash.
If counting the number of columns in the first row is enough, try the following:
awk -F'\t' '{print NF; exit}' myBigFile.tsv
where \t is the column delimiter.
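Since the question asks for both numbers, the row count can come from wc -l in the same script. A minimal sketch, assuming a tab-separated file whose rows all have as many columns as the first; sample.tsv is a made-up file created only for the demonstration:

```bash
# Hypothetical sample file, created only for this demonstration.
tmp=$(mktemp -d)
printf 'a\tb\tc\n1\t2\t3\n4\t5\t6\n' > "$tmp/sample.tsv"

# Columns: number of tab-separated fields in the first row only.
cols=$(awk -F'\t' '{print NF; exit}' "$tmp/sample.tsv")

# Rows: wc -l counts newline characters, so a final line without a
# trailing newline would not be counted.
rows=$(wc -l < "$tmp/sample.tsv")
rows=$((rows))   # strip the padding some wc implementations add

echo "rows: $rows columns: $cols"
```

For this sample it prints rows: 3 columns: 3.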
head -1 file.tsv | tr '\t' '\n' | wc -l
Take the first line, turn each tab into a newline (use ',' instead of '\t' for comma-separated files), and count the resulting lines.
Perl solution:
perl -ane '$maxc = $#F if $#F > $maxc; END{$maxc++; print "max columns: $maxc\nrows: $.\n"}' file
If your input file is comma-separated:
perl -F, -ane '$maxc = $#F if $#F > $maxc; END{$maxc++; print "max columns: $maxc\nrows: $.\n"}' file
output:
max columns: 5
rows: 2
-a autosplits each input line into the @F array
$#F is the last index of @F, i.e. the number of columns minus 1
-F, sets the field separator to , instead of whitespace
$. is the current line number (the number of rows once END is reached)
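The same one-pass idea can be written in awk, which may be handy where Perl is not installed. This is a sketch rather than anything the answer itself provides: count_dims is a hypothetical helper name, and the tab default for the delimiter is an assumption.

```bash
# count_dims FILE [DELIM]  (hypothetical helper, not from the answer above)
# Prints the column count of the widest row and the total number of rows.
# DELIM defaults to a tab; pass ',' for comma-separated input.
count_dims() {
    awk -v FS="${2:-\t}" '
        NF > max { max = NF }          # remember the widest row so far
        END { printf "max columns: %d\nrows: %d\n", max, NR }
    ' "$1"
}

# Demonstration on a made-up ragged file.
tmp=$(mktemp -d)
printf '1\t2\t3\n1\t2\t3\t4\n1\t2\n1\t2\t3\t4\t5\n' > "$tmp/ragged.tsv"
count_dims "$tmp/ragged.tsv"
```

On the made-up file this reports max columns: 5 and rows: 4.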
A small twist on kirill_igum's answer lets you count the columns of any particular row, which is what brought me to this question, even though the question asks about the whole file. (If every line has the same number of columns, this works for the whole file too.)
head -2 file |tail -1 |tr '\t' '\n' |wc -l
Gives the number of columns of row 2. Replace 2 with 55 for example to get it for row 55.
-bash-4.2$ cat file
1 2 3
1 2 3 4
1 2
1 2 3 4 5
-bash-4.2$ head -1 file |tail -1 |tr '\t' '\n' |wc -l
3
-bash-4.2$ head -4 file |tail -1 |tr '\t' '\n' |wc -l
5
The code above works if your file is tab-separated (as the sample file above must be, given the counts it returns), since a tab is the separator we pass to tr. If your file uses another separator, say commas, you can count the columns with the same trick by changing the separator character '\t' to ',':
-bash-4.2$ cat csvfile
1,2,3,4
1,2
1,2,3,4,5
-bash-4.2$ head -2 csvfile |tail -1 |tr ',' '\n' |wc -l
2
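An awk variant of the same per-row count reads only up to the requested row and then stops. One difference worth knowing: when the file has fewer than N lines, head -N | tail -1 silently reports on the last line, whereas this sketch prints nothing. cols_in_row is a hypothetical name, and the tab default for the delimiter is an assumption.

```bash
# cols_in_row FILE ROW [DELIM]  (hypothetical helper)
# Prints the number of fields in line ROW of FILE, then stops reading.
cols_in_row() {
    awk -v FS="${3:-\t}" -v row="$2" 'NR == row { print NF; exit }' "$1"
}

# Demonstration on a made-up comma-separated file.
tmp=$(mktemp -d)
printf '1,2,3,4\n1,2\n1,2,3,4,5\n' > "$tmp/csvfile"
cols_in_row "$tmp/csvfile" 2 ','   # prints 2
cols_in_row "$tmp/csvfile" 3 ','   # prints 5
```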
If your file is big but you are certain that the number of columns remains the same for each row (and you have no heading) use:
head -n 1 FILE | awk '{print NF}'
to find the number of columns, where FILE is your file name. awk splits on whitespace by default; pass -F for another delimiter.
To find the number of lines, wc -l FILE will work.
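If you are not certain that every row has the same number of columns, one quick check is to tally the distinct field counts. A sketch, assuming a tab-separated file unless a delimiter is given; col_histogram is a hypothetical name.

```bash
# col_histogram FILE [DELIM]  (hypothetical helper)
# Prints one line per distinct column count: "<columns> <rows with that count>".
col_histogram() {
    awk -v FS="${2:-\t}" '
        { seen[NF]++ }                             # tally rows per field count
        END { for (n in seen) print n, seen[n] }   # awk iteration order is unspecified
    ' "$1" | sort -n                               # so sort by column count
}

# Demonstration on a made-up file with consistent rows.
tmp=$(mktemp -d)
printf 'a\tb\tc\n1\t2\t3\n' > "$tmp/even.tsv"
col_histogram "$tmp/even.tsv"
```

A single output line means every row has the same number of columns; several lines show how many rows have each width.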
awk 'BEGIN{FS=","}END{print "COLUMN NO: "NF " ROWS NO: "NR}' file
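You can set FS to any delimiter. In the END block, NR is the number of rows and NF is the number of columns in the last row read, so this assumes every row has the same number of columns.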