How to do a Postgresql subquery in select clause with join in from clause like SQL Server?

前端 未结 5 1975
野性不改
野性不改 2021-01-30 04:00

I am trying to write the following query on postgresql:

select name, author_id, count(1), 
    (select count(1)
    from names as n2
    where n2.id = n1.id
             


        
相关标签:
5条回答
  • 2021-01-30 04:12

    I'm not sure I understand your intent perfectly, but perhaps the following would be close to what you want:

    select n1.name, n1.author_id, count_1, total_count
      from (select id, name, author_id, count(1) as count_1
              from names
              group by id, name, author_id) n1
    inner join (select id, author_id, count(1) as total_count
                  from names
                  group by id, author_id) n2
      on (n2.id = n1.id and n2.author_id = n1.author_id)
    

    Unfortunately this adds the requirement of grouping the first subquery by id as well as name and author_id, which I don't think was wanted. I'm not sure how to work around that, though, as you need to have id available to join in the second subquery. Perhaps someone else will come up with a better solution.

    Share and enjoy.

    0 讨论(0)
  • 2021-01-30 04:14

    Complementing @Bob Jarvis and @dmikam answer, Postgres don't perform a good plan when you don't use LATERAL, below a simulation, in both cases the query data results are the same, but the cost are very different

    Table structure

    CREATE TABLE ITEMS (
        N INTEGER NOT NULL,
        S TEXT NOT NULL
    );
    
    INSERT INTO ITEMS
      SELECT
        (random()*1000000)::integer AS n,
        md5(random()::text) AS s
      FROM
        generate_series(1,1000000);
    
    CREATE INDEX N_INDEX ON ITEMS(N);
    

    Performing JOIN with GROUP BY in subquery without LATERAL

    EXPLAIN 
    SELECT 
        I.*
    FROM ITEMS I
    INNER JOIN (
        SELECT 
            COUNT(1), n
        FROM ITEMS
        GROUP BY N
    ) I2 ON I2.N = I.N
    WHERE I.N IN (243477, 997947);
    

    The results

    Merge Join  (cost=0.87..637500.40 rows=23 width=37)
      Merge Cond: (i.n = items.n)
      ->  Index Scan using n_index on items i  (cost=0.43..101.28 rows=23 width=37)
            Index Cond: (n = ANY ('{243477,997947}'::integer[]))
      ->  GroupAggregate  (cost=0.43..626631.11 rows=861418 width=12)
            Group Key: items.n
            ->  Index Only Scan using n_index on items  (cost=0.43..593016.93 rows=10000000 width=4)
    

    Using LATERAL

    EXPLAIN 
    SELECT 
        I.*
    FROM ITEMS I
    INNER JOIN LATERAL (
        SELECT 
            COUNT(1), n
        FROM ITEMS
        WHERE N = I.N
        GROUP BY N
    ) I2 ON 1=1 --I2.N = I.N
    WHERE I.N IN (243477, 997947);
    

    Results

    Nested Loop  (cost=9.49..1319.97 rows=276 width=37)
      ->  Bitmap Heap Scan on items i  (cost=9.06..100.20 rows=23 width=37)
            Recheck Cond: (n = ANY ('{243477,997947}'::integer[]))
            ->  Bitmap Index Scan on n_index  (cost=0.00..9.05 rows=23 width=0)
                  Index Cond: (n = ANY ('{243477,997947}'::integer[]))
      ->  GroupAggregate  (cost=0.43..52.79 rows=12 width=12)
            Group Key: items.n
            ->  Index Only Scan using n_index on items  (cost=0.43..52.64 rows=12 width=4)
                  Index Cond: (n = i.n)
    

    My Postgres version is PostgreSQL 10.3 (Debian 10.3-1.pgdg90+1)

    0 讨论(0)
  • 2021-01-30 04:20
    select n1.name, n1.author_id, cast(count_1 as numeric)/total_count
      from (select id, name, author_id, count(1) as count_1
              from names
              group by id, name, author_id) n1
    inner join (select distinct(author_id), count(1) as total_count
                  from names) n2
      on (n2.author_id = n1.author_id)
    Where true
    

    used distinct if more inner join, because more join group performance is slow

    0 讨论(0)
  • 2021-01-30 04:28

    I am just answering here with the formatted version of the final sql I needed based on Bob Jarvis answer as posted in my comment above:

    select n1.name, n1.author_id, cast(count_1 as numeric)/total_count
      from (select id, name, author_id, count(1) as count_1
              from names
              group by id, name, author_id) n1
    inner join (select author_id, count(1) as total_count
                  from names
                  group by author_id) n2
      on (n2.author_id = n1.author_id)
    
    0 讨论(0)
  • 2021-01-30 04:29

    I know this is old, but since Postgresql 9.3 there is an option to use a keyword "LATERAL" to use RELATED subqueries inside of JOINS, so the query from the question would look like:

    SELECT 
        name, author_id, count(*), t.total
    FROM
        names as n1
        INNER JOIN LATERAL (
            SELECT 
                count(*) as total
            FROM 
                names as n2
            WHERE 
                n2.id = n1.id
                AND n2.author_id = n1.author_id
        ) as t ON 1=1
    GROUP BY 
        n1.name, n1.author_id
    
    0 讨论(0)
提交回复
热议问题