I am using the UTL_FILE utility in Oracle to export data to a CSV file from a script, and as a result I get a set of text files.
Case 1:
egrep -c test1.csv doesn't have a search term to match, so it is going to use test1.csv as the regular expression and wait for input to search. I have no idea how you managed to get it to return 2 for your first example.
A usable egrep command that will actually produce the number of records in the files is egrep -c '"[[:digit:]]*"' test1.csv, assuming your examples are accurate.
timp@helez:~/tmp$ cat test.txt
"sno","name"
"1","hari is in singapore
ramesh is in USA"
"2","pong is in chaina
chang is in malaysia
vilet is in uk"
timp@helez:~/tmp$ egrep -c '"[[:digit:]]*"' test.txt
2
timp@helez:~/tmp$ cat test2.txt
"sno","name"
"1","hari is in singapore"
"2","ramesh is in USA"
timp@helez:~/tmp$ egrep -c '"[[:digit:]]*"' test2.txt
2
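One caveat with the pattern above is that it is not anchored, so any line that happens to contain "" or a quoted number is counted as well. A stricter variant can be sketched as follows, assuming every record starts at the beginning of a line with a quoted, non-empty numeric sno. The sample data and the variable names are made up for illustration, and grep -E is used as the equivalent of egrep.

```shell
# Hypothetical sample: the first record has a continuation line
# that contains an embedded (doubled) quoted number.
sample=$(mktemp)
cat > "$sample" <<'EOF'
"sno","name"
"1","hari is in singapore
room ""12"" is booked"
"2","ramesh is in USA"
EOF

# Unanchored pattern from the answer: also matches the continuation line.
loose_count=$(grep -Ec '"[[:digit:]]*"' "$sample")

# Anchored pattern: only lines that begin with a quoted number followed by a comma.
strict_count=$(grep -Ec '^"[[:digit:]]+",' "$sample")

echo "loose=$loose_count strict=$strict_count"
rm -f "$sample"
```

On this sample the unanchored pattern reports 3 and the anchored one reports 2. If the first column can be empty or non-numeric, the anchored pattern would need to be adjusted.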
Alternatively, you might do better to add an extra value to your SELECT statement: something like SELECT 'recmatch.,.,',sno,name FROM TABLE; instead of SELECT sno,name FROM TABLE; and then grep for recmatch.,., though that's something of a hack.
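To make the marker idea concrete, here is a minimal sketch. It assumes the export quotes the marker like any other column and that the header row does not contain it; the sample file contents and variable names are hypothetical. Note that the dots in recmatch.,., would act as wildcards in a regular expression, so the sketch uses grep -F to match the marker as a fixed string.

```shell
# Hypothetical export with the extra marker column added by the SELECT.
export_file=$(mktemp)
cat > "$export_file" <<'EOF'
"marker","sno","name"
"recmatch.,.,","1","hari is in singapore
ramesh is in USA"
"recmatch.,.,","2","pong is in chaina
chang is in malaysia
vilet is in uk"
EOF

# -F: treat the marker as a literal string, not a regex.
# -c: count matching lines; each record carries the marker on exactly one line.
marker_count=$(grep -cF 'recmatch.,.,' "$export_file")

echo "$marker_count"
rm -f "$export_file"
```

This only works as long as the marker text never appears inside the real data, which is why an unusual string is chosen.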
In your second example, the lines do not start with " followed by a number. That's why the count is 0. You can try egrep -c "^\"([0-9]|\")" to also catch empty first-column values. But in fact it might be simpler to count all lines and subtract 1 for the header row. Redirect the file into wc so that it prints only the number and not the file name.
e.g.
count=$(( $(wc -l < test.csv) - 1 ))
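A quick sketch of how the line-count approach behaves. The sample files below are hypothetical and simply mirror the two layouts shown earlier in the thread. Redirecting with < keeps wc from printing the file name, which would otherwise break the arithmetic. The second file shows the limitation: once a value spans several lines, the line count no longer equals the record count.

```shell
# Hypothetical sample files: one record per line vs. records with embedded newlines.
single=$(mktemp)
multi=$(mktemp)
printf '%s\n' '"sno","name"' '"1","hari is in singapore"' '"2","ramesh is in USA"' > "$single"
printf '%s\n' '"sno","name"' '"1","hari is in singapore' 'ramesh is in USA"' \
  '"2","pong is in chaina' 'chang is in malaysia' 'vilet is in uk"' > "$multi"

# wc -l reads from stdin here, so it prints only the number of lines.
single_count=$(( $(wc -l < "$single") - 1 ))
multi_count=$(( $(wc -l < "$multi") - 1 ))

echo "single-line records: $single_count"   # 3 lines minus the header
echo "multi-line records:  $multi_count"    # 6 lines minus the header, although there are only 2 records
rm -f "$single" "$multi"
```

So subtracting 1 from the line count is only reliable when no column value contains a newline.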
The columns are different in the two cases. To make it generic, I wrote a Perl script that prints the number of rows. It generates a regex from the header and uses it to count the rows. I assumed that the first line is always the header, so it determines the number of columns.
#!/usr/bin/perl
use strict;
use warnings;

# Expect the CSV file name as the only argument
my $file = shift @ARGV or die "Usage: $0 filename\n";
open(my $fh, '<', $file) or die "Failed to open $file: $!\n";

# Get the columns from the header and use their count to construct the regex
my $head = <$fh>;
defined $head or die "File $file is empty\n";
my @col = split(",", $head);   # Columns array
my $col_cnt = scalar(@col);    # Columns count

# Read the rest of the rows into a single string
my $rows = '';
while (<$fh>) {
    $rows .= $_;
}
close($fh);

# Create a regex based on the number of columns
# E.g. for 3 columns, the regex should be
# ".*?",".*?",".*?"
# where each ".*?" represents anything between " and "
my $regex = join(",", ('".*?"') x $col_cnt);

# /s lets . match newlines, so the data is treated as a single line
# /g for global matching
my @row_cnt = $rows =~ m/($regex)/sg;
print "Row count:" . scalar(@row_cnt) . "\n";
Just save it as row_count.pl, make it executable with chmod +x row_count.pl, and run it as ./row_count.pl filename