I have multiple *.csv files. I want to concatenate them into a single CSV file in a PowerShell script. All the CSV files have the same header (the first line), so when I concatenate them the header should appear only once in the output.
Note: The solution in this answer intentionally uses plain-text processing, for two reasons:

Use of Import-Csv and Export-Csv incurs significant processing overhead (though that may not matter in a given situation); plain-text processing is significantly faster.

In Windows PowerShell and PowerShell [Core] 6.x, the output invariably has double-quoted column values, even if the input values weren't quoted (though that should normally not matter). In PowerShell 7+, Export-Csv and ConvertTo-Csv have a -UseQuotes parameter that allows you to control quoting in the output.

That said, Import-Csv and Export-Csv are certainly the better choice whenever you need to read and interpret the data (as opposed to just copying it elsewhere) - see Sid's helpful answer.
# The single output file.
# Note: Best to save this in a different folder than the input
# folder, in case you need to run the script multiple times.
$outFile = 'outdir/out.csv'

# Get all input CSV files as an array of file-info objects,
# from the current dir. in this example.
$inFiles = @(Get-ChildItem -Filter *.csv)

# Extract the header line (column names) from the first input file
# and write it to the output file.
Get-Content $inFiles[0].FullName -First 1 | Set-Content -Encoding Utf8 $outFile

# Process all input files and append their *data* rows to the
# output file (that is, skip the header row).
# NOTE: If you only wanted to extract a given count $count of data rows
# from each file, add -First ($count+1) to the Get-Content call.
foreach ($file in $inFiles) {
  Get-Content $file.FullName | Select-Object -Skip 1 |
    Set-Content -Append -Encoding Utf8 $outFile
}
Note the use of -Encoding Utf8 as an example; adjust as needed. By default, Set-Content uses "ANSI" encoding (the system's active legacy code page) in Windows PowerShell, and BOM-less UTF-8 in PowerShell [Core] 6+.
Caveat: By doing line-by-line plain-text processing, you're relying on each text line representing a single CSV data row; this is typically true, but doesn't have to be.
Conversely, if performance is paramount, the plain-text approach above could be made significantly faster with direct use of .NET methods such as [IO.File]::ReadLines() or, if the files are small enough, even [IO.File]::ReadAllLines().
You can also do it like this:
(Get-ChildItem -Path $path -Filter *.csv).FullName | Import-Csv | Export-Csv $path\concatenated.csv -NoTypeInformation
where $path
is the folder in which the CSV files reside. The final CSV file will be created in the same folder (so if you re-run the command, you may want to exclude concatenated.csv itself from the inputs).