How to count related rows including subcategories?

陌路散爱 提交于 2020-06-16 17:24:13

问题


I've got a few tables in a Postgres 12.3 database.

The first one is named category:

id|template_id|name           |entry_count|is_base_template|can_rename|can_delete|section|userId|parentCategoryId|
--|-----------|---------------|-----------|----------------|----------|----------|-------|------|----------------|
 1|           |Notes          |          0|true            |true      |true      |A      |      |                |
 2|           |ToDo           |          0|true            |true      |true      |A      |      |                |
 3|          1|Notes          |          0|false           |true      |true      |A      |     1|                |
 4|          2|ToDo           |          0|false           |true      |true      |A      |     1|                |
 5|           |Must Do        |          0|false           |          |          |A      |      |               4|
 6|           |Important notes|          0|false           |          |          |A      |      |               3|

The second table is called entry - which has no bearing on the question at hand.

And there is the link table category_entries_entry:

categoryId|entryId|
----------|-------|
         4|      1|
         5|      5|
         5|      6|
         4|      7|
         3|      8|
         6|      9|

A category can possess children, if parentCategoryId is not NULL then we're dealing with a child. For instance, the category with id = 5 is a subcategory of id = 4. Children can't have their own children, so only one level of nesting.

I need to count the number of entries for each category including subcategories.

This request does what I need, mostly. But it does not take into account the user:

SELECT COALESCE(c."parentCategoryId" , c.id) as cat_id , COUNT(*) as entries_in_cat
FROM category c JOIN
     category_entries_entry r
     ON c.id = r."categoryId" 
WHERE c.is_base_template = false
GROUP BY cat_id;

Returns:

cat_id|entries_in_cat|
------|--------------|
     4|             4|
     3|             2|

The category table has also userId and the count should only be executed for a given user. Notably, only root categories have entries for userId.

And I want to list subcategories with their counts additionally. So the desired output with userId = 1 for the given sample is:

cat_id|entries_in_cat|
------|--------------|
     5|             2|
     4|             4|
     6|             1|
     3|             2|

Here's a break down:

1) Category number 6 is a subcategory of the 3rd category it has 1 entry, so the result is correct.

2) Category number 3 is a category (that's to say, it does not have a parent), it contains 1 entry and another one should come from the 6th subcategory, that's 2 in total. Your script returns 1 which is wrong.

3) Category number 5 is a subcategory of the 4th category, it contains 2 entries. Your script returns also 2 which is right.

4) category number 4 is a category, it has 2 entries of its own and another two come from the 5th subcategory, that's 4 in total. Your script returns 2 which is wrong. It should return 4.

How can I achieve that?


回答1:


This does the job for a single level of nesting:

To list only root categories, counts include subcategories:

WITH root AS (
   SELECT id AS cat_id, id AS sub_id
   FROM   category
   WHERE  is_base_template = false
   AND    "userId" = 1
   )
SELECT c.cat_id, count(*)::int AS entries_in_cat
FROM  (
   TABLE root
   UNION ALL
   SELECT r.cat_id, c.id
   FROM   root     r
   JOIN   category c ON c."parentCategoryId" = r.cat_id
   ) c
JOIN   category_entries_entry e ON e."categoryId" = c.sub_id
GROUP  BY c.cat_id;

The point is to join on sub_id, but group by cat_id.

To list root categories like above, and subcategories additionally:

WITH root AS (
   SELECT id AS cat_id, id AS sub_id
   FROM   category
   WHERE  is_base_template = false
   AND    "userId" = 1
   )
, ct AS (
   SELECT c.cat_id, c.sub_id, count(*)::int AS ct
   FROM  (
      TABLE root
      UNION ALL
      SELECT r.cat_id, c.id AS sub_id
      FROM   root     r
      JOIN   category c ON c."parentCategoryId" = r.cat_id
      ) c
   JOIN   category_entries_entry e ON e."categoryId" = c.sub_id
   GROUP  BY c.cat_id, c.sub_id
   )
SELECT cat_id, sum(ct)::int AS entries_in_cat
FROM   ct
GROUP  BY 1

UNION ALL
SELECT sub_id, ct
FROM   ct
WHERE  cat_id <> sub_id;

db<>fiddle here

For an arbitrary number of nesting levels, use a recursive CTE. Example:

  • How to link rows in a many-to-many relationship where types have parents, grandparents etc

About the optional short syntax TABLE parent:

  • Is there a shortcut for SELECT * FROM?



回答2:


Since the filtering applies to the parent categories only you'll need to apply it there first: then you can get the subcategories. Once you have the parent categories and the subcategories, you can join them to the entries and count them.

In sum, the query should look like:

with
c as (
  SELECT id
  FROM category
  WHERE userId = 1 AND is_base_template = false
),
s as (
  SELECT d.id
  FROM c
  JOIN category d on d.parentCategoryId = c.id
)
SELECT u.id, count(*) as entries_in_cat
FROM (select id from c union select id from s) u
JOIN category_entries_entry r ON u.id = r."categoryId" 
GROUP BY u.id


来源:https://stackoverflow.com/questions/62350754/how-to-count-related-rows-including-subcategories

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!