Concatenate CSV files in PowerShell, without the first line (except for the first file)

别跟我提以往 2021-01-15 20:34

I have multiple *.csv files that I want to concatenate into a single CSV file in a PowerShell script. All the CSV files have the same header (the first line), so when I concate

2 Answers
  • 2021-01-15 20:59

    Note: The solution in this answer intentionally processes the files as plain text, for two reasons:

    • Use of Import-Csv and Export-Csv incurs significant processing overhead (though that may not matter in a given situation); plain-text processing is significantly faster.

    • In Windows PowerShell and PowerShell [Core] 6.x, Export-Csv output invariably has double-quoted column values, even if the input values weren't quoted (though that should normally not matter).

      • In PowerShell [Core] 7.0+ Export-Csv and ConvertTo-Csv now have a -UseQuotes parameter that allows you to control quoting in the output.

    That said, Import-Csv and Export-Csv are certainly the better choice whenever you need to read and interpret the data (as opposed to just copying it elsewhere) - see Sid's helpful answer.


    # The single output file.
    # Note: Best to save this in a different folder than the input
    #       folder, in case you need to run multiple times.
    $outFile = 'outdir/out.csv'
    
    # Get all input CSV files as an array of file-info objects,
    # from the current dir. in this example
    $inFiles = @(Get-ChildItem -Filter *.csv)
    
    # Extract the header line (column names) from the first input file
    # and write it to the output file.
    Get-Content $inFiles[0].FullName -First 1 | Set-Content -Encoding Utf8 $outFile
    
    # Process all input files and append their *data* rows to the
    # output file (that is, skip the header row).
    # NOTE: If you only wanted to extract a given count $count of data rows
    #       from each file, add -First ($count+1) to the Get-Content call.
    foreach ($file in $inFiles) {
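      # Use the loop variable $file ($_ is not defined in a foreach statement)
      # and Add-Content to append (Set-Content has no -Append parameter).
      Get-Content $file.FullName | Select-Object -Skip 1 |
        Add-Content -Encoding Utf8 $outFile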
    }
    

    Note that -Encoding Utf8 is used as an example; adjust as needed. By default, Set-Content uses "ANSI" encoding in Windows PowerShell and BOM-less UTF-8 in PowerShell Core.

    Caveat: By doing line-by-line plain-text processing, you're relying on each text line representing a single CSV data row; this is typically true, but doesn't have to be (a quoted field may contain embedded line breaks).

    Conversely, if performance is paramount, the plain-text approach above could be made significantly faster with direct use of .NET methods such as [IO.File]::ReadLines() or, if the files are small enough, even [IO.File]::ReadAllLines().
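
    As a rough sketch of the [IO.File]::ReadLines() variant mentioned above: the function name Merge-CsvText and its parameters are hypothetical, not part of the original answer. It assumes PowerShell 5+ (for the ::new() syntax), full paths (.NET methods do not use PowerShell's current location), an existing output folder, and that every input file starts with the header line.

    ```powershell
    # Hypothetical helper, shown for illustration only.
    function Merge-CsvText {
      param(
        [Parameter(Mandatory)] [string[]] $Path,    # full paths of the input files
        [Parameter(Mandatory)] [string]   $OutFile  # full path of the output file
      )
      # Write UTF-8 without a BOM; $false means overwrite rather than append.
      $writer = [IO.StreamWriter]::new($OutFile, $false, [Text.UTF8Encoding]::new($false))
      try {
        $isFirstFile = $true
        foreach ($file in $Path) {
          $isHeader = $true
          # ReadLines() enumerates lazily, so no file is loaded into memory as a whole.
          foreach ($line in [IO.File]::ReadLines($file)) {
            if ($isHeader) {
              $isHeader = $false
              if (-not $isFirstFile) { continue }  # skip the repeated header line
            }
            $writer.WriteLine($line)
          }
          $isFirstFile = $false
        }
      }
      finally {
        $writer.Dispose()  # flush and close the output file
      }
    }

    # Possible usage, with the variables from the code above:
    #   $inFiles = @(Get-ChildItem -Filter *.csv)
    #   Merge-CsvText -Path $inFiles.FullName -OutFile (Join-Path (Convert-Path outdir) out.csv)
    ```

    The same caveat applies as for the Get-Content version: each text line is treated as one CSV row.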

  • 2021-01-15 21:01

    You could do it like this:

    (Get-ChildItem -Path $path -Filter *.csv).FullName | Import-Csv | Export-Csv $path\concatenated.csv -NoTypeInformation
    

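    Here, $path is the folder that contains the CSV files; the resulting file is written to that same folder. Because the output file also matches *.csv, remove it or write it elsewhere before running the command again.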
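
    If the command may be run more than once, one option is to filter out the previous output by name. The following is only a sketch: the function name Merge-CsvFile and its parameters are hypothetical, and it assumes PowerShell 3+ and that all input files share the same columns.

    ```powershell
    # Hypothetical helper, shown for illustration only.
    function Merge-CsvFile {
      param(
        [Parameter(Mandatory)] [string] $Path,   # folder containing the input CSV files
        [string] $OutName = 'concatenated.csv'   # output file name, created inside $Path
      )
      Get-ChildItem -LiteralPath $Path -Filter *.csv -File |
        Where-Object Name -ne $OutName |         # ignore the output of a previous run
        Import-Csv |
        # In PowerShell 7.0+, -UseQuotes AsNeeded could be added here to avoid
        # double-quoting every value.
        Export-Csv -LiteralPath (Join-Path $Path $OutName) -NoTypeInformation
    }

    # Possible usage:
    #   Merge-CsvFile -Path $path
    ```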
