Count matches between multiple columns and words in a nested array

无人共我 2021-01-29 05:34

My earlier question was resolved. Now I need to develop a related, but more complex query.

I have a table like this:

id     description          addition         


        
1 answer
  • 2021-01-29 06:05

    The answer isn't simple, but figuring out what you are asking was harder:

    SELECT row_number() OVER (ORDER BY t.id) AS id
         , t.id AS "RID"
         , count(DISTINCT a.ord) AS "Matches"
    FROM   tbl t
    LEFT   JOIN (
       unnest(array_content) WITH ORDINALITY x(elem, ord)
       CROSS JOIN LATERAL
       unnest(string_to_array(elem, ',')) txt
       ) a ON t.description ~ a.txt
           OR t.additional_info ~ a.txt
    GROUP  BY t.id;
    

    Produces your desired result exactly.
    array_content is your array of search terms.
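
    To try the query without the original table, here is a minimal sketch. The sample rows and the array literal are made up for illustration, since the question's data is not shown in full; only the table and column names come from the query above. The ORDER BY is added just to keep the output order stable.

```sql
-- Hypothetical sample data, not the asker's actual rows
CREATE TEMP TABLE tbl (
   id              int PRIMARY KEY
 , description     text
 , additional_info text
);

INSERT INTO tbl VALUES
   (1, 'Festivals and games in town', 'sport events')
 , (2, 'swim lessons'               , NULL)
 , (3, 'nothing relevant'           , 'none');

-- Same query, with an array literal standing in for array_content
SELECT row_number() OVER (ORDER BY t.id) AS id
     , t.id AS "RID"
     , count(DISTINCT a.ord) AS "Matches"
FROM   tbl t
LEFT   JOIN (
   unnest('{"Festivals,games","sport,swim"}'::text[]) WITH ORDINALITY x(elem, ord)
   CROSS JOIN LATERAL
   unnest(string_to_array(elem, ',')) txt
   ) a ON t.description ~ a.txt
       OR t.additional_info ~ a.txt
GROUP  BY t.id
ORDER  BY t.id;

--  id | RID | Matches
-- ----+-----+---------
--   1 |   1 |       2
--   2 |   2 |       1
--   3 |   3 |       0
```

    Row 3 still appears with a count of 0 because of the LEFT JOIN: count() ignores the NULL ord of a row without any match.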

    How does this work?

    Each array element of the outer array in your search term is a comma-separated list. Decompose the odd construct by unnesting twice (after transforming each element of the outer array into another array). Example:

    SELECT *
    FROM   unnest('{"Festivals,games","sport,swim"}'::varchar[]) WITH ORDINALITY x(elem, ord)
    CROSS  JOIN LATERAL
           unnest(string_to_array(elem, ',')) txt;
    

    Result:

     elem            | ord |  txt
    -----------------+-----+------------
     Festivals,games | 1   | Festivals
     Festivals,games | 1   | games
     sport,swim      | 2   | sport
     sport,swim      | 2   | swim
    

    Since you want to count matches for each outer array element once, we generate a unique number on the fly with WITH ORDINALITY. Details:

    • PostgreSQL unnest() with element number

    Now we can LEFT JOIN to this derived table on the condition of a desired match:

       ... ON t.description ~ a.txt
           OR t.additional_info ~ a.txt
    

    ... and get the count with count(DISTINCT a.ord), which counts each outer array element only once even if several of its search terms match.
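
    To see why DISTINCT matters, the following sketch compares both counts for a single string. The test string and the alias names t(term), term_matches and element_matches are made up for illustration.

```sql
-- 'Festivals' and 'games' both match, but both belong to outer element 1
SELECT count(*)            AS term_matches     -- 2: one per matching term
     , count(DISTINCT ord) AS element_matches  -- 1: one per outer array element
FROM   unnest('{"Festivals,games","sport,swim"}'::text[]) WITH ORDINALITY x(elem, ord)
CROSS  JOIN LATERAL
       unnest(string_to_array(elem, ',')) t(term)
WHERE  'Festivals and games' ~ term;
```

    A plain count(*) would report two matches here, which is not what the question asks for.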

    Finally, I added the mysterious id in your result with row_number() OVER (ORDER BY t.id) AS id, assuming it's supposed to be a serial number. Voilà.

    The same considerations for regular expression matches (~) as in your previous question apply:

    • Postgres query to calculate matching strings
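
    The linked answer is not reproduced here. As a general reminder of how ~ behaves in Postgres (an illustration with made-up strings, not a summary of that answer): the match is case-sensitive, and the search term is interpreted as a regular expression, so special characters in a term are not taken literally.

```sql
SELECT 'Summer Festivals' ~  'festivals' AS case_sensitive      -- false
     , 'Summer Festivals' ~* 'festivals' AS case_insensitive    -- true
     , 'a+b' ~ 'a+b'                     AS plus_as_quantifier  -- false: + means "one or more a"
     , position('a+b' in 'a+b') > 0      AS literal_substring;  -- true
```

    If the search terms can contain such characters, a literal substring test like position() may be the safer choice; whether that applies depends on your data.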