Compare each column's contents with all other columns' contents and present matrix of match counts

Posted by 纵然是瞬间 on 2020-01-07 03:58:26

Question


Given this table:

I'd like to derive this table:

...sort of like a mileage chart in a map book.

I'm trying to create a cross-table comparison of the words in each of the columns, against all of the other columns' words, to show how many matches there are between them.

For instance, comparing Column 1 against Column 2 might yield 4 matches. The yellow, bold-outlined cells are the matches.

And here's how I count them:

I'm thinking there might be an 'easy' way to accomplish this using Power Query. Is there?

(Oh...and by the way...the solution I'm looking for should not expect a static number of input columns: i.e., it should accommodate more or fewer columns in the input comparison set.)

Thanks.


Answer 1:


No, there is no easy way, but it can be done. However, my results differ from yours. My interpretation of your logic is this: for each pair of columns, take every word the two columns have in common, multiply its number of occurrences in one column by its number of occurrences in the other, and sum these products. These are my results:

And this is my query code:

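let
    // Table1 is assumed to be an existing query that holds the input table
    Source = Table1,
    // Column names are read at run time, so the number of columns is not fixed
    ColumnNames = Table.ColumnNames(Source),
    // Build every ordered pair of column names (Columns, Columns2)
    Tabled = Table.FromColumns({ColumnNames}, type table[Columns = text]),
    AddedColumns2 = Table.AddColumn(Tabled, "Columns2", each ColumnNames, type {text}),
    ExpandedColumns2 = Table.ExpandListColumn(AddedColumns2, "Columns2"),
    // Distinct words shared by both columns; empty when a column is paired with itself
    CommonWords = 
        Table.AddColumn(ExpandedColumns2, 
                        "DistinctIntersect", 
                        each if [Columns] = [Columns2]
                           then {} 
                           else List.Distinct(List.Intersect({Table.Column(Source,[Columns]),
                                                              Table.Column(Source,[Columns2])}))),
    // For each shared word, multiply its occurrence counts in the two columns, then sum
    // ({0} is prepended so that an empty list sums to 0 instead of null)
    AddedCount = 
        Table.AddColumn(CommonWords,
                        "Count", 
                        (This) => List.Sum({0}&List.Transform(This[DistinctIntersect],
                                                   each List.Count(List.PositionOf(Table.Column(Source,This[Columns]),_,Occurrence.All)) *
                                                        List.Count(List.PositionOf(Table.Column(Source,This[Columns2]),_,Occurrence.All)))),
                       Int64.Type),
    RemovedColumns = Table.RemoveColumns(AddedCount,{"DistinctIntersect"}),
    // Pivot the Columns2 values into headers to obtain the match-count matrix
    PivotedColumn = Table.Pivot(RemovedColumns, List.Distinct(RemovedColumns[Columns2]), "Columns2", "Count")
in
    PivotedColumn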
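
To make the counting rule concrete, here is a minimal standalone sketch that applies the same multiply-and-sum step to a single pair of lists. The lists A and B are hypothetical sample data made up for illustration; they are not taken from the question's table.

```
let
    // Hypothetical sample lists (assumption: not from the original table)
    A = {"apple", "pear", "apple", "fig"},
    B = {"apple", "fig", "fig", "kiwi"},
    // Distinct words that occur in both lists
    Common = List.Distinct(List.Intersect({A, B})),
    // For each common word: occurrences in A times occurrences in B, then sum
    Count = List.Sum({0} & List.Transform(Common,
        (w) => List.Count(List.PositionOf(A, w, Occurrence.All)) *
               List.Count(List.PositionOf(B, w, Occurrence.All))))
in
    Count
```

Under this interpretation, "apple" contributes 2 × 1 and "fig" contributes 1 × 2, so the pair scores 4. If you would rather count each common word only once, replacing the product with 1 (that is, taking List.Count(Common)) would be one way to do it, though that is a different rule from the one used in the answer.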


Source: https://stackoverflow.com/questions/44395636/compare-each-columns-contents-with-all-other-columns-contents-and-present-matr
