Compare two CSVs, match rows on two or more columns, and export specific columns from both CSVs with PowerShell

傲寒 2021-01-24 06:33

I have two CSV files.

left.csv

Ref_ID,First_Name,Last_Name,DOB
321364060,User1,Micah,11/01/1969
946497594,User2,Acker,05/28/1960
887327716,User3,Aco,06/26/1950         


        
5 Answers
  • 2021-01-24 07:12

    Adding an answer I found:

    $left = Import-Csv .\left.csv
    $right = Import-Csv .\right.csv
    
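    # For each row of right.csv, find the left.csv rows with the same name and DOB,
    # then emit the left-hand columns plus the extra columns from the right-hand row.
    $right | ForEach-Object {
        $r = $_
        $left | Where-Object { $_.First_Name -eq $r.First_Name -and $_.Last_Name -eq $r.Last_Name -and $_.DOB -eq $r.DOB } |
            Select-Object Ref_ID,
                First_Name,
                Last_Name,
                DOB,
                @{Name="City";Expression={$r.City}},
                @{Name="Document_Type";Expression={$r.Document_Type}},
                @{Name="FileName";Expression={$r.FileName}}
    } | Format-Table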
    
  • 2021-01-24 07:22

    You could build your own key from each CSV, then add the data from both CSVs to a new hashtable under that key.

    Step through this in a debugger (ISE or VS Code) and tailor it to what you need. Add error checking as appropriate, depending on how clean your data is. Some statements below are there only for debugging, so you can inspect what is happening as it runs.

    # Ref_ID,First_Name,Last_Name,DOB
    $csv1 = @'
    321364060,User1,Micah,11/01/1969
    946497594,User2,Acker,05/28/1960
    887327716,User3,Aco,06/26/1950
    588496260,User4,John,05/23/1960
    565465465,User5,Jack,07/08/2020
    '@
    
    # First_Name,Last_Name,DOB,City,Document_Type,Filename
    $csv2 = @'
    User1,Micah,11/01/1969,Parker,Transcript,T4IJZSYO.pdf
    User2,Acker,05/28/1960,,Transcript,R4IKTRYN.pdf
    User3,Aco,06/26/1950,,Transcript,R4IKTHMK.pdf
    User4,John,05/23/1960,,Letter,R4IKTHSL.pdf
    '@
    
    # hashtable
    $data = @{}
    
    $c1 = $csv1 -split '\r?\n'   # accept both CRLF and LF line endings
    $c1.count
    
    foreach ($item in $c1)
    {
        $fields = $item -split ','
        $key = $fields[1]+$fields[2]+$fields[3]
        $key
    
        # add new hashtable for given key
        $data.Add($key, [ordered]@{})
    
        # add data from c1 to the hashtable
        $data[$key].ID = $fields[0]
        $data[$key].First = $fields[1]
        $data[$key].Last = $fields[2]
        $data[$key].DOB = $fields[3]
    }
    
    $c2 = $csv2 -split '\r?\n'   # accept both CRLF and LF line endings
    $c2.count
    
    foreach ($item in $c2)
    {
        $fields = $item -split ','
        $key = $fields[0]+$fields[1]+$fields[2]
        $key
    
        # add data from c2 to the hashtable (skip rows that have no match in c1)
        if ($data.ContainsKey($key))
        {
            $data[$key].Type = $fields[4]
            $data[$key].FileName = $fields[5]
        }
    }
    
    $data.Count
    
    foreach ($key in $data.Keys)
    {
        '====================='
        $data[$key]
    }
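
The same keyed-lookup idea can also be sketched with parsed CSV objects instead of manual string splitting. This is only a minimal sketch under a few assumptions: the sample rows are a subset copied from above, the '|' separator is assumed never to occur in the data, and the names $lookup and $combined are made up for illustration. If right.csv contains duplicate keys, the last row wins.

```powershell
# Sample rows (a subset of the data above); in practice use Import-Csv .\left.csv etc.
$left = @(
    'Ref_ID,First_Name,Last_Name,DOB'
    '321364060,User1,Micah,11/01/1969'
    '946497594,User2,Acker,05/28/1960'
    '565465465,User5,Jack,07/08/2020'
) | ConvertFrom-Csv

$right = @(
    'First_Name,Last_Name,DOB,City,Document_Type,Filename'
    'User1,Micah,11/01/1969,Parker,Transcript,T4IJZSYO.pdf'
    'User2,Acker,05/28/1960,,Transcript,R4IKTRYN.pdf'
) | ConvertFrom-Csv

# Index the right-hand rows by a composite key built from the match columns.
# The '|' separator (assumed absent from the data) keeps adjacent fields apart.
$lookup = @{}
foreach ($r in $right) {
    $key = $r.First_Name, $r.Last_Name, $r.DOB -join '|'
    $lookup[$key] = $r
}

# Emit one combined object for every left-hand row that has a match.
$combined = foreach ($l in $left) {
    $key = $l.First_Name, $l.Last_Name, $l.DOB -join '|'
    if ($lookup.ContainsKey($key)) {
        $r = $lookup[$key]
        [pscustomobject]@{
            Ref_ID        = $l.Ref_ID
            First_Name    = $l.First_Name
            Last_Name     = $l.Last_Name
            DOB           = $l.DOB
            City          = $r.City
            Document_Type = $r.Document_Type
            Filename      = $r.Filename
        }
    }
}

$combined | Format-Table
```

To write a file instead of a table, pipe $combined to Export-Csv with -NoTypeInformation. Because each left-hand row costs one hashtable lookup, this avoids rescanning the other file for every row.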
    
  • 2021-01-24 07:22

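    Try Join-Object, a third-party cmdlet that is not built into PowerShell and has to be installed first (the example calls it as Join).
    Besides joining on multiple columns, it has a few more features: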

    $Left = ConvertFrom-Csv @"
    Ref_ID,First_Name,Last_Name,DOB
    321364060,User1,Micah,11/01/1969
    946497594,User2,Acker,05/28/1960
    887327716,User3,Aco,06/26/1950
    588496260,User4,John,05/23/1960
    565465465,User5,Jack,07/08/2020
    "@
    
    $Right = ConvertFrom-Csv @"
    First_Name,Last_Name,DOB,City,Document_Type,Filename
    User1,Micah,11/01/1969,Parker,Transcript,T4IJZSYO.pdf
    User2,Acker,05/28/1960,,Transcript,R4IKTRYN.pdf
    User3,Aco,06/26/1950,,Transcript,R4IKTHMK.pdf
    User4,John,05/23/1960,,Letter,R4IKTHSL.pdf
    "@
    
    $Left | Join $Right `
        -On First_Name, Last_Name, DOB `
        -Property Ref_ID, Filename, First_Name, DOB, Last_Name `
        | Format-Table
    
    Last_Name    Ref_ID DOB                    Filename     First_Name
    ---------    ------ ---                    --------     ----------
    Micah     321364060 1969-11-01 12:00:00 AM T4IJZSYO.pdf User1
    Acker     946497594 1960-05-28 12:00:00 AM R4IKTRYN.pdf User2
    Aco       887327716 1950-06-26 12:00:00 AM R4IKTHMK.pdf User3
    John      588496260 1960-05-23 12:00:00 AM R4IKTHSL.pdf User4
    
  • 2021-01-24 07:29
    $left = Import-Csv C:\left.csv
    $right = Import-Csv C:\right.csv
    
    Compare-Object -ReferenceObject $left -DifferenceObject $right -Property First_Name,Last_Name,DOB -IncludeEqual -ExcludeDifferent | 
        ForEach-Object {
            $iItem = $_
            $ileft = $left.Where({$_.First_Name -eq $iItem.First_Name -and $_.Last_Name -eq $iItem.Last_Name -and $_.DOB -eq $iItem.DOB})
            $iright = $right.Where({$_.First_Name -eq $iItem.First_Name -and $_.Last_Name -eq $iItem.Last_Name -and $_.DOB -eq $iItem.DOB})
            [pscustomobject]@{
                Ref_ID=$ileft.Ref_ID
                first_name=$ileft.first_name
                last_name=$ileft.last_name
                DOB=$ileft.DOB
                Document_Type=$iright.Document_Type
                Filename=$iright.Filename
            }
        } | Export-Csv C:\Combined.csv -NoTypeInformation
    
  • 2021-01-24 07:32

    Some good answers already, and here's another.

    Import your myriad objects into a single (dis)array:

    $left = @"
    Ref_ID,First_Name,Last_Name,DOB
    321364060,User1,Micah,11/01/1969
    946497594,User2,Acker,05/28/1960
    887327716,User3,Aco,06/26/1950
    588496260,User4,John,05/23/1960
    565465465,User5,Jack,07/08/2020
    "@
    
    $right = @"
    First_Name,Last_Name,DOB,City,Document_Type,Filename
    User1,Micah,11/01/1969,Parker,Transcript,T4IJZSYO.pdf
    User2,Acker,05/28/1960,,Transcript,R4IKTRYN.pdf
    User3,Aco,06/26/1950,,Transcript,R4IKTHMK.pdf
    User4,John,05/23/1960,,Letter,R4IKTHSL.pdf
    "@
    
    $disarray = @(
        $left | ConvertFrom-Csv 
        $right | ConvertFrom-Csv
    )
    

    Use Group-Object to organize them into groups having identical key values:

    $keyProps = @('First_Name', 'Last_Name', 'DOB')
    $disarray | 
        Group-Object -Property $keyProps | 
        Where-Object Count -gt 1 |
    

    Then, continuing the same pipeline, merge the objects in each group, adding any missing properties to the output $mergedObject:

        ForEach-Object {
            $mergedObject = $_.group[0]
            foreach ($obj in $_.group[1..($_.group.count-1)]) {
                $newProps = ($obj | Get-Member -MemberType NoteProperty).name | 
                    Where-Object {
                        $_ -notin ($mergedobject | Get-Member -MemberType NoteProperty).name
                    } 
                foreach ($propName in $newProps) {
                    $mergedObject | Add-Member -MemberType NoteProperty -Name $propName -Value $obj.$propName -Force
                }
            }
            Write-Output $mergedObject
        }
    

    This doesn't differ wildly from the answers you already have, but eliminating the "left"/"right" distinction might be helpful: the code above should handle three or more sources thrown into $disarray, merging all objects that share identical $keyProps values.

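    Note that there are corner cases to consider. For instance, what happens if one object has 'City=Chicago' for a user and another has 'City=New York'? With the merge above, the value from the first object in the group is kept and the other is silently dropped.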
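
One way to surface such conflicts before merging is to list every non-key property that carries more than one distinct value within a group. The following is only a sketch: the sample rows and the $conflicts name are invented for illustration, and empty values are assumed to mean "missing" rather than a conflicting value.

```powershell
# Hypothetical sample: User1 has two different cities, User2 is consistent.
$rows = @(
    'First_Name,Last_Name,DOB,City'
    'User1,Micah,11/01/1969,Chicago'
    'User1,Micah,11/01/1969,New York'
    'User2,Acker,05/28/1960,Parker'
    'User2,Acker,05/28/1960,Parker'
) | ConvertFrom-Csv

$keyProps = 'First_Name', 'Last_Name', 'DOB'

$conflicts = $rows |
    Group-Object -Property $keyProps |
    Where-Object Count -gt 1 |
    ForEach-Object {
        $groupName = $_.Name
        $group = $_.Group
        # Every property name seen in the group, minus the key columns.
        $names = $group |
            ForEach-Object { $_.PSObject.Properties.Name } |
            Sort-Object -Unique |
            Where-Object { $_ -notin $keyProps }
        foreach ($name in $names) {
            # Distinct non-empty values of this property across the group.
            $values = @($group.$name | Where-Object { $_ } | Sort-Object -Unique)
            if ($values.Count -gt 1) {
                [pscustomobject]@{
                    Key      = $groupName
                    Property = $name
                    Values   = $values -join ' / '
                }
            }
        }
    }

$conflicts | Format-Table
```

Whether to stop on a conflict, keep the first value, or keep both is a decision that depends on the data, so the sketch only reports what it finds.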
