MySQL: query with two many to many relations and duplicates

Submitted by 断了今生、忘了曾经 on 2021-02-11 07:35:30

Question


I have three models: articles, authors and tags. Each article can have many authors and many tags.

So my DB will have the following tables:

`article`
`article_author`
`author`
`article_tag`
`tag`

Here is the schema in MySQL:

DROP TABLE IF EXISTS article_tag;
DROP TABLE IF EXISTS article_author;
DROP TABLE IF EXISTS author;
DROP TABLE IF EXISTS tag;
DROP TABLE IF EXISTS article;

CREATE TABLE IF NOT EXISTS author (
  id INT(11) NOT NULL AUTO_INCREMENT,
  name VARCHAR(255),
  PRIMARY KEY (id)
);

CREATE TABLE IF NOT EXISTS article (
  id INT(11) NOT NULL AUTO_INCREMENT,
  title VARCHAR(255),
  PRIMARY KEY (id)
);

CREATE TABLE IF NOT EXISTS tag (
  id INT(11) NOT NULL AUTO_INCREMENT,
  tag VARCHAR(255),
  PRIMARY KEY (id)
);

CREATE TABLE IF NOT EXISTS article_author (
  article_id INT(11) NOT NULL,
  author_id INT(11) NOT NULL,
  PRIMARY KEY (article_id, author_id),
  INDEX fk_article_author_article_idx (article_id ASC) VISIBLE,
  INDEX fk_article_author_author_idx (author_id ASC) VISIBLE,
  CONSTRAINT fk_article_author_article
    FOREIGN KEY (article_id)
    REFERENCES article (id),
  CONSTRAINT fk_article_author_author
    FOREIGN KEY (author_id)
    REFERENCES author (id)
);

CREATE TABLE IF NOT EXISTS article_tag (
  article_id INT(11) NOT NULL,
  tag_id INT(11) NOT NULL,
  PRIMARY KEY (article_id, tag_id),
  INDEX fk_article_tag_article_idx (article_id ASC) VISIBLE,
  INDEX fk_article_tag_tag_idx (tag_id ASC) VISIBLE,
  CONSTRAINT fk_article_tag_article
    FOREIGN KEY (article_id)
    REFERENCES article (id),
  CONSTRAINT fk_article_tag_tag
    FOREIGN KEY (tag_id)
    REFERENCES tag (id)
);

And we can insert some data in our DB:

INSERT INTO article (id, title) VALUES (1, 'first article'), (2, 'second article'), (3, 'third article');
INSERT INTO author (id, name) VALUES (1, 'first author'), (2, 'second author'), (3, 'third author'), (4, 'fourth author');
INSERT INTO tag (id, tag) VALUES (1, 'first tag'), (2, 'second tag'), (3, 'third tag'), (4, 'fourth tag'), (5, 'fifth tag');
INSERT INTO article_tag (article_id, tag_id) VALUES (1, 1), (1, 2), (1, 3), (2, 2), (2, 4), (2, 5), (3, 1), (3, 2);
INSERT INTO article_author (article_id, author_id) VALUES (1, 1), (1, 2), (1, 3), (2, 2), (2, 4), (3, 1), (3, 2), (3, 3), (3, 4);

Now I want to retrieve the articles, and for every article I want the related author ids as well as tag ids:

SELECT 
  article.id, 
  article.title,
  JSON_ARRAYAGG(author.id) AS authors,
  JSON_ARRAYAGG(tag.id) AS tags
FROM article
INNER JOIN article_author ON article.id = article_author.article_id
INNER JOIN author ON article_author.author_id = author.id
INNER JOIN article_tag ON article.id = article_tag.article_id
INNER JOIN tag ON article_tag.tag_id = tag.id
GROUP BY article.id;

This returns duplicates. They are not caused by JSON_ARRAYAGG (if we replace it with COUNT, the duplicates are still there) but by the double relation in the same query: if we remove either the tags or the authors from the query, the duplicates disappear. But I would really like to be able to query multiple relations in the same query.

How can I avoid those duplicates?


Answer 1:


I suspect you mean duplicates in the JSON fields. The problem is that you are joining along two different dimensions, so you get a Cartesian product for each article.
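To make the Cartesian product visible, here is a small sketch that runs against the sample data from the question. It counts the joined rows per article next to the number of distinct authors and tags; the column aliases are made up for illustration.

```sql
-- Each article ends up with (number of authors) x (number of tags) joined rows.
SELECT
  article.id,
  COUNT(*) AS joined_rows,
  COUNT(DISTINCT article_author.author_id) AS author_count,
  COUNT(DISTINCT article_tag.tag_id) AS tag_count
FROM article
INNER JOIN article_author ON article.id = article_author.article_id
INNER JOIN article_tag ON article.id = article_tag.article_id
GROUP BY article.id;

-- With the sample data:
--   id 1: joined_rows 9, author_count 3, tag_count 3
--   id 2: joined_rows 6, author_count 2, tag_count 3
--   id 3: joined_rows 8, author_count 4, tag_count 2
```

Every aggregate that does not deduplicate, including JSON_ARRAYAGG, sees all of those joined rows, which is why each author id is repeated once per tag and each tag id once per author.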

With some aggregation functions, you can just use DISTINCT to get around this. That option is not available for the JSON functions. Instead, you can use subqueries:

SELECT a.id, a.title,
       (SELECT JSON_ARRAYAGG(aa.author_id)
        FROM article_author aa 
        WHERE a.id = aa.article_id 
       ) as authors,
       (SELECT JSON_ARRAYAGG(art.tag_id)
        FROM article_tag art
        WHERE a.id = art.article_id 
       ) as tags
FROM article a;

Note that because you are only including the ids, you do not need to join to the base tables -- `author` and `tag`. Of course, you can do that in the subqueries if you want, but it is unnecessary.
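If you do want values from the base tables, for example names instead of ids, the join can live inside each subquery. This is a sketch of that variation rather than something the question asked for; the aliases `au`, `t`, `author_names` and `tag_names` are made up for illustration.

```sql
SELECT
  a.id,
  a.title,
  -- Author names for this article; the join to author stays inside the subquery.
  (SELECT JSON_ARRAYAGG(au.name)
   FROM article_author aa
   INNER JOIN author au ON au.id = aa.author_id
   WHERE aa.article_id = a.id
  ) AS author_names,
  -- Tag labels for this article, aggregated independently of the authors.
  (SELECT JSON_ARRAYAGG(t.tag)
   FROM article_tag art
   INNER JOIN tag t ON t.id = art.tag_id
   WHERE art.article_id = a.id
  ) AS tag_names
FROM article a;
```

Because each subquery aggregates on its own, the two relations never multiply each other. One difference from the INNER JOIN version in the question: an article with no authors or no tags still appears, with NULL in that column. Wrapping the subquery in COALESCE(..., JSON_ARRAY()) would return an empty array instead, if that suits you better.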

Here is a db<>fiddle.
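For comparison, this is roughly what the DISTINCT route mentioned above looks like with an aggregate that does accept it. GROUP_CONCAT is used here as a stand-in, so the result is a comma-separated string rather than a JSON array.

```sql
SELECT
  a.id,
  a.title,
  -- DISTINCT collapses the repeats produced by joining both relations at once.
  GROUP_CONCAT(DISTINCT aa.author_id ORDER BY aa.author_id) AS authors,
  GROUP_CONCAT(DISTINCT art.tag_id ORDER BY art.tag_id) AS tags
FROM article a
INNER JOIN article_author aa ON a.id = aa.article_id
INNER JOIN article_tag art ON a.id = art.article_id
GROUP BY a.id;

-- With the sample data:
--   id 1: authors '1,2,3',   tags '1,2,3'
--   id 2: authors '2,4',     tags '2,4,5'
--   id 3: authors '1,2,3,4', tags '1,2'
```

Keep in mind that the GROUP_CONCAT result is truncated at group_concat_max_len (1024 by default), and that the server still builds the full Cartesian product before deduplicating. For large relations the subquery approach is likely the safer choice.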



Source: https://stackoverflow.com/questions/62861992/mysql-query-with-two-many-to-many-relations-and-duplicates
