I have to merge multiple CSV files with same headers. I have to keep the header of the first file and remove headers of all the other files and merge them and create one mas
If Perl is an option:
perl -ne 'print if $. > 1 or ! $h; $h=1; close ARGV if eof' *.csv > master.csv
$. is the line number. It is NOT reset automatically between files, so 'close ARGV if eof' is needed.
$h records whether the header has already been printed.
awk 'FNR==1 && NR!=1{next;}{print}' *.csv
Tested on Solaris Unix:
> cat file1.csv
Id,city,name ,location
1,NA,JACK,CA
>
> cat file2.csv
ID,city,name,location
2,NY,JERRY,NY
>
> nawk 'FNR==1 && NR!=1{next;}{print}' *.csv
Id,city,name ,location
1,NA,JACK,CA
2,NY,JERRY,NY
>
Explanation given by kevin-d:
FNR is the number of lines (records) read so far in the current file. NR is the number of lines read overall. So the condition 'FNR==1 && NR!=1{next;}' says, "Skip this line if it is the first line of the current file but not the first line read overall." This has the effect of printing the CSV header of the first file while skipping it in the rest.
Link for the difference between awk and nawk
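For readers who want the same FNR/NR logic outside awk, here is a minimal Python sketch. It assumes the standard fileinput module, where filelineno() behaves like FNR and lineno() like NR. The function name merge_keep_first_header and the throwaway files are illustrative only and do not come from the answers above.

```python
import fileinput
import os
import tempfile


def merge_keep_first_header(paths):
    """Return merged lines: header from the first file only, data from all files."""
    merged = []
    # Note: an empty list of paths would make fileinput fall back to sys.argv/stdin.
    with fileinput.input(files=paths) as stream:
        for line in stream:
            # filelineno() plays the role of awk's FNR, lineno() the role of NR.
            if stream.filelineno() == 1 and stream.lineno() != 1:
                continue  # header of a later file: skip it
            merged.append(line)
    return merged


# Demo with temporary files that mirror file1.csv and file2.csv from the session above.
with tempfile.TemporaryDirectory() as tmp:
    samples = {
        "file1.csv": "Id,city,name ,location\n1,NA,JACK,CA\n",
        "file2.csv": "ID,city,name,location\n2,NY,JERRY,NY\n",
    }
    paths = []
    for name, text in samples.items():
        path = os.path.join(tmp, name)
        with open(path, "w", newline="") as fh:
            fh.write(text)
        paths.append(path)
    result = merge_keep_first_header(paths)
    print("".join(result), end="")
```

Like the awk one-liner, this works line by line, so it shares the limitation that each header must fit on a single physical line.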
<?php
// Merge every CSV in include/ into file.csv, keeping only the first file's header.
// Note: auto_detect_line_endings is deprecated since PHP 8.1.
ini_set('auto_detect_line_endings', true);
$pattern = "include/*.csv";
$masterHeader = null; // header row taken from the first file
$rows = array();
foreach (glob($pattern) as $path) {
    $handle = fopen($path, 'r') or die('Unable to open file!');
    $header = null;
    while (($row = fgetcsv($handle)) !== false) {
        if ($header === null) {
            // The first row of each file is its header; keep only the first one seen.
            $header = $row;
            if ($masterHeader === null) {
                $masterHeader = $header;
            }
            continue;
        }
        // Stop reading this file at a blank line or a row whose first field is empty.
        if ($row[0] == null) {
            break;
        }
        $rows[] = $row;
    }
    fclose($handle);
}
//var_dump($rows);
$output = fopen("file.csv", 'w') or die("Can't open output");
if ($masterHeader !== null) {
    fputcsv($output, $masterHeader);
}
foreach ($rows as $row) {
    fputcsv($output, $row);
}
fclose($output) or die("Can't close output"); ?>
Just a side note for everyone who uses the accepted solution of this thread (as I do :)): be careful, because this code will fail if the header contains newlines, i.e., something like
column1,"column\nwith\nnew line",column2
value1,value2,value3
...
In this case, only the part 'column1,"column' will be considered the header, and the rest of the header will be treated as a normal row (which will completely break your final CSV). If you have a header with a newline inside, the only solution I can think of is to use a "full-fledged" CSV reader library that can read the header correctly.
Despite this minor issue, the line above saved me a lot of headaches. :D
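As one possible take on the "full-fledged CSV reader" idea, here is a minimal Python sketch using the standard csv module. The module reads a quoted field that contains line breaks as part of a single record. The helper name merge_csv_streams and the in-memory sample data are assumptions made for illustration. With real files you would open each one with newline="", as the csv documentation recommends.

```python
import csv
import io


def merge_csv_streams(streams, out):
    """Copy records from each stream to out, writing the header only once."""
    writer = csv.writer(out, lineterminator="\n")
    header_written = False
    for stream in streams:
        reader = csv.reader(stream)
        # next() returns one full record, even if it spans several physical lines.
        header = next(reader, None)
        if header is None:
            continue  # empty input: nothing to copy
        if not header_written:
            writer.writerow(header)
            header_written = True
        writer.writerows(reader)


# In-memory stand-ins for two files whose header has a quoted multi-line field.
first = io.StringIO(
    'column1,"column\nwith\nnew line",column2\nvalue1,value2,value3\n', newline=""
)
second = io.StringIO(
    'column1,"column\nwith\nnew line",column2\nvalue4,value5,value6\n', newline=""
)
merged = io.StringIO(newline="")
merge_csv_streams([first, second], merged)
print(merged.getvalue(), end="")
```

The sketch takes the header from the first non-empty input and does not check that later headers match it. Whether to add that check depends on how much you trust the inputs.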