CMD or PowerShell command to combine (merge) corresponding lines from two files

说谎 2020-11-29 12:56

Is it possible, using CMD or PowerShell, to combine 2 files into 1 file like this:

file1-line1 tab file2-line1
file1-line2 tab file2-line2
file1-line3 tab file2-li         


        
6 Answers
  • 2020-11-29 13:28

    Probably the simplest solution is to use a Windows port of the Linux paste utility (e.g. paste.exe from the UnxUtils):

    paste C:\path\to\file1.txt C:\path\to\file2.txt
    

    From the man page:

    DESCRIPTION

    Write lines consisting of the sequentially corresponding lines from each FILE, separated by TABs, to standard output.


    For a PowerShell(ish) solution, I'd use two stream readers:

    $sr1 = New-Object IO.StreamReader 'C:\path\to\file1.txt'
    $sr2 = New-Object IO.StreamReader 'C:\path\to\file2.txt'
    
    while ($sr1.Peek() -ge 0 -or $sr2.Peek() -ge 0) {
      # Read the next line from each file; a file that has run out contributes ''.
      if ($sr1.Peek() -ge 0) { $txt1 = $sr1.ReadLine() } else { $txt1 = '' }
      if ($sr2.Peek() -ge 0) { $txt2 = $sr2.ReadLine() } else { $txt2 = '' }
    
      "{0}`t{1}" -f $txt1, $txt2
    }
    
    # Release the file handles.
    $sr1.Close()
    $sr2.Close()
    

    This avoids having to read the two files entirely into memory before merging them, which bears the risk of memory exhaustion for large files.
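
    If the merged lines should end up in a file, as the question asks, the same loop can be piped to Set-Content. The following is only a sketch under assumptions: the merge-demo-*.txt files in the temp directory are hypothetical sample inputs, EndOfStream is used here instead of Peek() as the end-of-file check, and try/finally is added so the readers are released even if something fails.

    ```powershell
    # Hypothetical sample inputs in the temp directory (not from the original answer).
    $dir = [IO.Path]::GetTempPath()
    $in1 = Join-Path $dir 'merge-demo-1.txt'
    $in2 = Join-Path $dir 'merge-demo-2.txt'
    $out = Join-Path $dir 'merge-demo-out.txt'
    Set-Content -Path $in1 -Value 'a1', 'a2', 'a3'
    Set-Content -Path $in2 -Value 'b1', 'b2'

    $sr1 = New-Object IO.StreamReader $in1
    $sr2 = New-Object IO.StreamReader $in2
    try {
      # Stream the merged lines straight into the output file.
      & {
        while ((-not $sr1.EndOfStream) -or (-not $sr2.EndOfStream)) {
          $txt1 = if ($sr1.EndOfStream) { '' } else { $sr1.ReadLine() }
          $txt2 = if ($sr2.EndOfStream) { '' } else { $sr2.ReadLine() }
          "{0}`t{1}" -f $txt1, $txt2
        }
      } | Set-Content -Path $out
    }
    finally {
      # Release the file handles even if the loop throws.
      $sr1.Dispose()
      $sr2.Dispose()
    }

    $result = @(Get-Content $out)
    $result
    ```

    Because the shorter file simply contributes an empty string, the last output line here ends with a tab.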

  • 2020-11-29 13:30

    A generalized solution supporting multiple files, building on Ansgar Wiechers' great, memory-efficient System.IO.StreamReader solution:

    PowerShell's ability to invoke members (properties, methods) directly on a collection and have them automatically invoked on all items in the collection (member enumeration, v3+) allows for easy generalization:

    # Make sure .NET has the same current dir. as PS.
    [System.IO.Directory]::SetCurrentDirectory($PWD)
    
    # The input file paths.
    $files = 'file1', 'file2', 'file3'
    
    # Create stream-reader objects for all input files.
    $readers = [IO.StreamReader[]] $files
    
    # Keep reading while at least 1 file still has more lines.
    while ($readers.EndOfStream -contains $false) {
    
      # Read the next line from each stream (file).
      # Streams that are already at EOF fortunately just return "".
      $lines = $readers.ReadLine()
    
      # Output the lines separated with tabs.
      $lines -join "`t"
    
    }
    
    # Close the stream readers.
    $readers.Close()
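
    Member enumeration, which the loop above relies on for both `$readers.EndOfStream` and `$readers.ReadLine()`, can be seen in isolation in the following sketch. The `$items` objects and their `Name`/`Done` properties are hypothetical stand-ins for the stream readers.

    ```powershell
    # Hypothetical stand-ins for the stream readers: two objects with a Done property.
    $items = [pscustomobject]@{ Name = 'a'; Done = $true },
             [pscustomobject]@{ Name = 'b'; Done = $false }

    # Property access on the array is applied to every element.
    $done = $items.Done                    # a two-element array: $true, $false
    $anyPending = $done -contains $false   # $true: at least one item is not done

    # Method calls are enumerated the same way.
    $upper = ('file1', 'file2').ToUpper()  # 'FILE1', 'FILE2'

    $done; $anyPending; $upper
    ```

    This only works for members the array itself does not have: `$items.Length`, for example, returns the length of the array, not of its elements.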
    

    Get-MergedLines (source code below; invoke with -? for help) wraps the functionality in a function that:

    • accepts a variable number of filenames - both as an argument and via the pipeline

    • uses a configurable separator to join the lines (defaults to a tab)

    • allows trimming trailing separator instances

    function Get-MergedLines() {
    <#
    .SYNOPSIS
    Merges lines from 2 or more files with a specifiable separator (default is tab).
    
    .EXAMPLE
    > Get-MergedLines file1, file2 '<->'
    
    .EXAMPLE
    > Get-ChildItem file? | Get-MergedLines
    #>
      param(
        [Parameter(Mandatory, ValueFromPipeline, ValueFromPipelineByPropertyName)]
        [Alias('PSPath')]
        [string[]] $Path,
    
        [string] $Separator = "`t",
    
        [switch] $TrimTrailingSeparators
      )
    
      begin { $allPaths = @() }
    
      process { $allPaths += $Path }
    
      end {
    
        # Resolve all paths to full paths, which may include wildcard resolution.
        # Note: By using full paths, we needn't worry about .NET's current dir.
        #       potentially being different.
        $fullPaths = (Resolve-Path $allPaths).ProviderPath
    
        # Create stream-reader objects for all input files.
        $readers = [System.IO.StreamReader[]] $fullPaths
    
        # Keep reading while at least 1 file still has more lines.
        while ($readers.EndOfStream -contains $false) {
    
          # Read the next line from each stream (file).
          # Streams that are already at EOF fortunately just return "".
          $lines = $readers.ReadLine()
    
          # Join the lines.
          $mergedLine = $lines -join $Separator
    
          # Trim (remove) trailing separators, if requested.
          if ($TrimTrailingSeparators) {
            $mergedLine = $mergedLine -replace ('^(.*?)(?:' + [regex]::Escape($Separator) + ')+$'), '$1'
          }
    
          # Output the merged line.
          $mergedLine
    
        }
    
        # Close the stream readers.
        $readers.Close()
    
      }
    
    }
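
    The effect of -TrimTrailingSeparators can be checked without any files by applying the same regular expression to sample strings. This is a small sketch; the input strings are made up for illustration.

    ```powershell
    $Separator = "`t"

    # Same pattern as in the function: lazily capture the line,
    # then match one or more separators at the very end.
    $pattern = '^(.*?)(?:' + [regex]::Escape($Separator) + ')+$'

    $trimmed   = "a`tb`t`t" -replace $pattern, '$1'   # trailing tabs are removed
    $unchanged = "a`t`tc"   -replace $pattern, '$1'   # no trailing tab, so no match

    $trimmed; $unchanged
    ```

    Separators between non-empty fields are never touched, so the column positions of the remaining values stay intact.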
    
  • 2020-11-29 13:31

    In PowerShell, and assuming both files have exactly the same number of lines:

    # @(...) guarantees an array even if a file contains only one line.
    $f1 = @(Get-Content file1)
    $f2 = @(Get-Content file2)
    
    for ($i = 0; $i -lt $f1.Length; ++$i) {
      $f1[$i] + "`t" + $f2[$i]
    }
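
    If the files may differ in length, the loop can run up to the larger line count and substitute an empty string for the missing lines. A minimal sketch, using in-memory sample arrays in place of the Get-Content calls:

    ```powershell
    # Sample data standing in for @(Get-Content file1) and @(Get-Content file2).
    $f1 = @('a1', 'a2', 'a3')
    $f2 = @('b1')

    # Iterate up to the longer of the two inputs.
    $count = [Math]::Max($f1.Count, $f2.Count)

    $merged = for ($i = 0; $i -lt $count; ++$i) {
      # An input that has run out contributes an empty string.
      $left  = if ($i -lt $f1.Count) { $f1[$i] } else { '' }
      $right = if ($i -lt $f2.Count) { $f2[$i] } else { '' }
      "$left`t$right"
    }

    $merged
    ```

    The explicit bounds checks keep the loop working under Set-StrictMode as well, where indexing past the end of an array is an error.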
    
  • 2020-11-29 13:31

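    PowerShell solution:

    # @(...) guarantees an array even if a file contains only one line.
    $file1 = @(Get-Content file1)
    $file2 = @(Get-Content file2)
    $outfile = "file3.txt"
    
    # Note: -Append adds to an existing file3.txt, so delete it before re-running.
    for ($i = 0; $i -lt $file1.Length; $i++) {
      "$($file1[$i])`t$($file2[$i])" | Out-File $outfile -Append
    }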
    
  • 2020-11-29 13:35
    @echo off
    setlocal EnableDelayedExpansion
    rem The next line must contain a literal tab character after the equal sign
    rem (retype it if your editor or browser converted it to spaces):
    set "TAB=	"
    Rem First file is read with FOR /F command
    Rem Second file is read via Stdin
    < file2.txt (for /F "delims=" %%a in (file1.txt) do (
       Rem Read next line from file2.txt
       set /P "line2="
       Rem Echo lines of both files separated by tab
       echo %%a%TAB%!line2!
    ))
    

    Further details at this post

  • 2020-11-29 13:43

    A number of recently locked [duplicate] questions link to this question, for example:

    • Merging two csvs into one with columns [duplicate]
    • Merge 2 csv files in powershell [duplicate]

    I do not agree with marking these as duplicates, because this question concerns text files whereas the others concern CSV files. As a general rule, I would advise against manipulating files that represent objects (like XML, JSON and CSV) as plain text. Instead, I recommend importing these files (to objects), making the required changes, and using ConvertTo-*/Export-* to write the results back to a file.

    One case where all the general solutions given for this question produce incorrect output for these "duplicates" is when both CSV files have a column (property) name in common.
    The general-purpose Join-Object cmdlet (a community cmdlet, not part of PowerShell itself; see also: In Powershell, what's the best way to join two tables into one?) joins two object lists side by side when the -On parameter is simply omitted. Therefore this solution is a better fit for the other (CSV) "duplicate" questions. Take Merge 2 csv files in powershell [duplicate] from @Ender as an example:

    $A = ConvertFrom-Csv @'
    ID,Name
    1,Peter
    2,Dalas
    '@
    
    $B = ConvertFrom-Csv @'
    Class
    Math
    Physic
    '@
    
    $A | Join $B
    
    ID Name  Class
    -- ----  -----
    1  Peter Math
    2  Dalas Physic
    

    Compared with the "text" merge solutions given in the other answers, the general Join-Object cmdlet can deal with different file lengths and lets you decide what to include (LeftJoin, RightJoin or FullJoin). You also have control over which columns you want to include ($A | Join $B -Property ID, Name), their order ($A | Join $B -Property ID, Class, Name), and a lot more that cannot be done by just concatenating text.
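
    For comparison, where installing Join-Object is not an option, a positional merge of the two object lists can be sketched with built-in cmdlets only. This is not how Join-Object is implemented; it merely reproduces the side-by-side result shown above, and if both lists share a column name the value from the second list silently overwrites the first.

    ```powershell
    # Same sample data as above; @(...) guarantees arrays even for a single row.
    $A = @('ID,Name', '1,Peter', '2,Dalas' | ConvertFrom-Csv)
    $B = @('Class', 'Math', 'Physic' | ConvertFrom-Csv)

    $count = [Math]::Max($A.Count, $B.Count)

    $merged = for ($i = 0; $i -lt $count; $i++) {
      $row = [ordered]@{}
      # Copy every property of the i-th object from each list, if that object exists.
      foreach ($list in $A, $B) {
        if ($i -lt $list.Count) {
          foreach ($p in $list[$i].PSObject.Properties) { $row[$p.Name] = $p.Value }
        }
      }
      [pscustomobject] $row
    }

    $merged | Format-Table
    ```

    Working on objects rather than on raw lines means that quoted values containing the delimiter survive the merge, which plain text concatenation cannot guarantee.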

    Specific to this question:

    As this specific question concerns text files rather than CSV files, you will need to add a header (property) name (e.g. -Header File1) while importing each file, and remove the header (Select-Object -Skip 1) when exporting the result:

    $File1 = Import-Csv .\File1.txt -Header File1 
    $File2 = Import-Csv .\File2.txt -Header File2
    $File3 = $File1 | Join $File2
    $File3 | ConvertTo-Csv -Delimiter "`t" -NoTypeInformation |
        Select-Object -Skip 1 | Set-Content .\File3.txt
    