Find first non-blank cell for each column from rows that match lookup to merge duplicate matches

橙三吉。 提交于 2020-01-07 03:03:47

问题


I have a spreadsheet that is combined contact lists from a number of sources, with varying degrees of completeness. Some contacts have provided their information multiple times, but left different fields blank each time. I'm looking to remove at least some of the duplicates by combining rows to fill in the blank values.

Here's a simplified example of what my data looks like

id    email                    phone    company    job title
01    johnsmith@example.com    5550123  acme inc   
01    johnsmith@example.com             acme inc   CEO
02    janedoe@gmail.com                 XYZ Co
02                             5555555
03    frank@school.edu
03    frank@school.edu                  school
03    frank@school.edu         5551111
04    bob@promoted.com                  job inc    VP Sales
04    bob@promoted.com                  job inc    mail clerk    

If I'm able to get all the blanks filled in then removing duplicates without losing data and dealing with cases like the last will be trivial. A solution that also de-dupes while not breaking or deleting case would also work.

I haven't found a way to get vlookup to loop through all matches until it finds a non-blank result, unless I sorted each column. Unfortunately my real data has 100+ columns and several thousand rows, so doing this manually for each column is impractical.


回答1:


Here is how to do it.

Assumes that:

  1. The data are in Sheet1, and that they start in cell A1.

  2. You have a blank Sheet2.

Procedure:

  1. Switch to Sheet2.

  2. In cell A1 enter this formula: =Sheet1!A1

  3. Copy that cell to B1 and to A2.

  4. In cell B2 enter this formula:

.

=IFERROR(INDEX(INDEX(Sheet1!B:B,MATCH(Sheet1!$A2,Sheet1!$A:$A,0)):INDEX(Sheet1!B:B,MATCH(Sheet1!$A2,Sheet1!$A:$A,1)),MATCH(TRUE,LEN(INDEX(Sheet1!B:B,MATCH(Sheet1!$A2,Sheet1!$A:$A,0)):INDEX(Sheet1!B:B,MATCH(Sheet1!$A2,Sheet1!$A:$A,1)))>0,0)),"")

This is an array formula and must be confirmed with Ctrl+Shift+Enter.

.

  1. Copy cell B2.

  2. Select the range B3 to B2000 (or however far down you need to go) and paste.

  3. Select all of column B by clicking the column B header at the top of the sheet.

  4. Copy column B.

  5. Select column C to Z (or however many you need) and paste.

That's it.

Tip: Once the formulas have done their thing, you can copy all of the cells on Sheet2 by clicking the tiny gray square at the extreme upper-left corner of the sheet and Paste Special as values. This will replace all the formulas with hard values on Sheet 2.




回答2:


This is based upon the principle that sorting a column of blanks and values will leave the blanks at the bottom. By cycling through the columns and sorting each section of like IDs, the non-blank values will rise to the top. These can easily be used to fill in cells in the lower ranks.

Sub fill_in_the_blanks()
    Dim rw As Long, cl As Long, id As Variant
    With Worksheets("Sheet1")
        With .Cells(1, 1).CurrentRegion
            .Cells.Sort Key1:=.Columns(1), Order1:=xlAscending, _
                        Orientation:=xlTopToBottom, Header:=xlYes
            For rw = 2 To .Rows.Count
                If id <> .Cells(rw, 1).Value2 Then
                    id = .Cells(rw, 1).Value2
                    With .Cells(rw, 1).Resize(Application.CountIf(.Columns(1), .Cells(rw, 1).Value), .Columns.Count)
                        For cl = 2 To .Columns.Count
                            .Cells.Sort Key1:=.Columns(1), Order1:=xlAscending, _
                                        Key2:=.Columns(cl), Order2:=xlAscending, _
                                        Orientation:=xlTopToBottom, Header:=xlNo
                            If CBool(Application.CountA(.Columns(cl))) And _
                             CBool(Application.CountBlank(.Columns(cl))) Then
                                With .Columns(cl).SpecialCells(xlCellTypeBlanks)
                                    .FormulaR1C1 = "=r[-1]c"
                                    .Value = .Value2
                                End With
                            End If
                        Next cl
                    End With
                End If
            Next rw
            .RemoveDuplicates Columns:=1, Header:=xlYes
        End With
    End With
End Sub

The area to be worked on is governed by the Range.CurrentRegion property. There should be no completely blank rows or columns segregating the 'island' of data.

        



来源:https://stackoverflow.com/questions/32639400/find-first-non-blank-cell-for-each-column-from-rows-that-match-lookup-to-merge-d

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!