Powerquery, does string contain an item in a list

房东的猫 submitted on 2021-02-11 16:11:28


I would like to filter on whether any of several text columns ([Name], [GenericName], or [SimpleGenericName]) contains a substring from a list. The text is also mixed case, so I need to apply Text.Lower([Column]) in there as well.

I've tried the formula:

= Table.SelectRows(#"Sorted Rows", each List.Contains(MED_NAME_LIST, Text.Lower([Name])))

However, this does not work because the [Name] column does not exactly match the items in the list (e.g. it won't pick up "Methylprednisolone Tab" if the list contains "methylprednisolone").

An example of a working filter, with some of the list written out, is:

= Table.SelectRows(#"Sorted Rows", each Text.Contains(Text.Lower([Name]), "methylprednisolone") or Text.Contains(Text.Lower([Name]), "hydroxychloroquine") or Text.Contains(Text.Lower([Name]), "remdesivir") or Text.Contains(Text.Lower([GenericName]), "methylprednisolone") or Text.Contains(Text.Lower([GenericName]), "hydroxychloroquine") or Text.Contains(Text.Lower([GenericName]), "remdesivir") or Text.Contains(Text.Lower([SimpleGenericName]), "methylprednisolone") or Text.Contains(Text.Lower([SimpleGenericName]), "hydroxychloroquine") or Text.Contains(Text.Lower([SimpleGenericName]), "remdesivir"))

I would like to make this cleaner than having to write all of this out, as I would also like to be able to expand the list from a referenced table to make this a dynamic search.

Thank you in advance


If I have a list of medicines:

and I need to filter my table:

to only keep rows where certain columns (we'll specify which ones exactly later) contain case-insensitive, partial matches for any of the items in the above list of medicines, then one way to do this might be:

    let
        // Substrings to search for (deliberately mixed case).
        MED_NAME_LIST = {"MEthYlprednisolone", "hYdroxychloroquine", "rEMdesivir"},
        initialTable = Table.FromRows({
            {"Methylprednisolone Tab", "train", "car", "bike"},
            {"no", "no", "no", "no"},
            {"tram", "teleport", "hydroxychloroQuine Tab", "jet"},
            {"no", "no", "no", "yes"},
            {"REMdesivir Tab", "bus", "taxi", "concord"}
        }, type table [Name = text, GenericName = text, SimpleGenericName = text, SomeOtherColumn = text]),
        // Keep a row if any of the three columns partially matches any medicine.
        filtered = Table.SelectRows(initialTable, each List.ContainsAny(
            {[Name], [GenericName], [SimpleGenericName]},
            MED_NAME_LIST,
            (rowValue as text, medicineFromList as text) as logical => Text.Contains(rowValue, medicineFromList, Comparer.OrdinalIgnoreCase)
        ))
    in
        filtered

  • In filtered, List.ContainsAny is used to determine if any of the specified columns (Name, GenericName, SimpleGenericName) contain a "match" for any of the values in MED_NAME_LIST.
  • The criteria for a "match" are that:
    • case sensitivity must be ignored (hence Comparer.OrdinalIgnoreCase is used)
    • the match must be partial (hence Text.Contains is used)

The above code gives me the following, which I believe is the filtering behaviour you described:
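
Since you mentioned wanting to expand the list from a referenced table, here is a minimal sketch of how that might look. MedicineTable and its Medicine column are hypothetical names standing in for your referenced query, and containsAnyMedicine is a helper I made up for illustration. It spells out the same case-insensitive, partial match with List.AnyTrue and List.Transform instead of a custom comparer, and treats null cells as non-matches:

```
let
    // Hypothetical lookup table; in practice this would be your referenced query.
    MedicineTable = Table.FromRows({
        {"methylprednisolone"},
        {"hydroxychloroquine"},
        {"remdesivir"}
    }, type table [Medicine = text]),
    // Pull the column out as a list, so the search grows with the table.
    MED_NAME_LIST = MedicineTable[Medicine],
    initialTable = Table.FromRows({
        {"Methylprednisolone Tab", "train", "car", "bike"},
        {"no", "no", "no", "yes"},
        {"tram", "teleport", "hydroxychloroQuine Tab", "jet"}
    }, type table [Name = text, GenericName = text, SimpleGenericName = text, SomeOtherColumn = text]),
    // True when any of the given texts contains any medicine name, ignoring case.
    containsAnyMedicine = (texts as list) as logical =>
        List.AnyTrue(List.Transform(texts, (t) =>
            t <> null and List.AnyTrue(List.Transform(MED_NAME_LIST, (m) =>
                Text.Contains(t, m, Comparer.OrdinalIgnoreCase))))),
    filtered = Table.SelectRows(initialTable, each containsAnyMedicine({[Name], [GenericName], [SimpleGenericName]}))
in
    filtered
```

With this shape, adding a row to the lookup table is enough to widen the search, and the list of columns to check lives in one place.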

