MySQL Query - Get latest comment related to the post

刺人心 2021-01-11 20:38

I am trying to get the latest 1 or 2 comments related to each post I download, a bit like Instagram does, as they show the latest 3 comments for each post. So far I am gettin

6 answers
  • 2021-01-11 21:15

    This type of question has been posted many times, and getting the "latest-for-each" always appears to be a stumbling block and a join / subquery nightmare for most.

    Especially for a web interface, you might be better to tack on a column (or 2 or 3) to the one table that is your active "posts" table such as Latest1, Latest2, Latest3.

    Then put an insert trigger on your comment table that updates the main post row with the newest comment ID, so you always have that ID on the post without any sub-joins. If, as you mentioned, you want the last 2 or 3 IDs, add the 3 sample columns and have the insert trigger on the comment detail table update the primary post table with something like

    update PrimaryPostTable
       set Latest3 = Latest2,
           Latest2 = Latest1,
           Latest1 = NewDetailCommentID
       where PostID = PostIDFromTheInsertedDetail
    

    This would have to be formalized into a proper trigger under MySQL, but should be easy enough to implement. You could prime the list with the latest 1; then, as new comments come in, the trigger automatically rolls the most recent ones into the 1st, 2nd and 3rd positions. Finally, your query could be simplified down to something like

    Select
          P.PostID,
          P.TopicDescription,
          PD1.WhateverDetail as LatestDetail1,
          PD2.WhateverDetail as LatestDetail2,
          PD3.WhateverDetail as LatestDetail3
       from
          Posts P
             LEFT JOIN PostDetail PD1
                on P.Latest1 = PD1.PostDetailID
             LEFT JOIN PostDetail PD2
                on P.Latest2 = PD2.PostDetailID
             LEFT JOIN PostDetail PD3
                on P.Latest3 = PD3.PostDetailID
       where
          whateverCondition
    

    Denormalizing data is typically NOT desired. However, in cases such as this, it is a great simplifier for getting these "latest" entries in a For-Each type of query. Good luck.

    Here is a fully working sample in MySQL so you can see the tables, the results of the SQL inserts, and the automatic stamping via the trigger that updates the main post table. Querying the post table then shows how the most recent activity automatically rolls into the first, second and third positions. Finally, a join shows how to pull all the data for each "post activity".

    CREATE TABLE Posts
    (   id int, 
        uuid varchar(7),
        imageLink varchar(9),
        `date` datetime,
        ActivityID1 int null,
        ActivityID2 int null,
        ActivityID3 int null,
        PRIMARY KEY (id)
    );
    
    CREATE TABLE Activity
    (   id int, 
        postid int,
        `type` varchar(40) collate utf8_unicode_ci, 
        commentText varchar(20) collate utf8_unicode_ci, 
        `date` datetime,
        PRIMARY KEY (id)
    );
    
    DELIMITER //
    
    CREATE TRIGGER ActivityRecAdded
    AFTER INSERT ON Activity FOR EACH ROW
    BEGIN
        Update Posts
            set ActivityID3 = ActivityID2,
                ActivityID2 = ActivityID1,
                ActivityID1 = NEW.ID
            where
                ID = NEW.POSTID;
    
    END; //
    
    DELIMITER ;
    
    
    
    INSERT INTO Posts
        (id, uuid, imageLink, `date`)
        VALUES
        (123, 'test1', 'blah', '2016-10-26 00:00:00');
    
    INSERT INTO Posts
        (id, uuid, imageLink, `date`)
        VALUES
        (125, 'test2', 'blah 2', '2016-10-26 00:00:00');
    
    
    INSERT INTO Activity
        (id, postid, `type`, `commentText`, `date`)
    VALUES
        (789, 123, 'type1', 'any comment', '2016-10-26 00:00:00'),
        (821, 125, 'type2', 'another comment', '2016-10-26 00:00:00'),
        (824, 125, 'type3', 'third comment', '2016-10-27 00:00:00'),
        (912, 123, 'typeAB', 'comment', '2016-10-27 00:00:00');
    
    -- See the results after the insert and the trigger.
    -- You will see that post 123 in the Posts table has been updated with the
    -- most recent
    -- activity ID=912 in position Posts.ActivityID1
    -- activity ID=789 in position Posts.ActivityID2
    -- no value in position Posts.ActivityID3
    select * from Posts;
    
    -- NOW, insert two more records for post ID = 123.
    -- you will see the shift of ActivityIDs adjusted
    INSERT INTO Activity
        (id, postid, `type`, `commentText`, `date`)
    VALUES
        (931, 123, 'type1', 'any comment', '2016-10-28 00:00:00'),
        (948, 123, 'newest', 'blah', '2016-10-29 00:00:00');
    
    -- See the results after the insert and the trigger.
    -- You will see that post 123 in the Posts table has been updated with the
    -- most recent
    -- activity ID=948 in position Posts.ActivityID1
    -- activity ID=931 in position Posts.ActivityID2
    -- activity ID=912 in position Posts.ActivityID3
    -- Notice the FIRST activity 789 is no longer there: once a 4th
    -- entry arrives, the oldest one gets pushed out.
    select * from Posts;
    
    -- Finally, query the data to get the most recent 3 items for each post.
    select
            p.id,
            p.uuid,
            p.imageLink,
            p.`date`,
            A1.id NewestActivityPostID,
            A1.`type` NewestType,
            A1.`date` NewestDate,
            A2.id SecondActivityPostID,
            A2.`type` SecondType,
            A2.`date` SecondDate,
            A3.id ThirdActivityPostID,
            A3.`type` ThirdType,
            A3.`date` ThirdDate
        from
            Posts p
                left join Activity A1
                    on p.ActivityID1 = A1.ID
                left join Activity A2
                    on p.ActivityID2 = A2.ID
                left join Activity A3
                    on p.ActivityID3 = A3.ID;
    

    You can create a test database to try this example, so as not to corrupt your own.
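
    If the three columns are added to a Posts table that already has activity, the slots start out empty. Below is a minimal sketch of one way to prime them, assuming the Posts and Activity sample tables above. Ordering by `date` with `id` as a tie-breaker is my assumption; the trigger itself simply follows insertion order.

```sql
-- Backfill the three "latest" slots from existing Activity rows.
-- Assumes the sample schema above (Posts.id, Activity.postid, Activity.`date`).
-- Each scalar subquery yields NULL when a post has too few activity rows.
UPDATE Posts p
SET p.ActivityID1 = (SELECT a.id
                     FROM Activity a
                     WHERE a.postid = p.id
                     ORDER BY a.`date` DESC, a.id DESC
                     LIMIT 1),
    p.ActivityID2 = (SELECT a.id
                     FROM Activity a
                     WHERE a.postid = p.id
                     ORDER BY a.`date` DESC, a.id DESC
                     LIMIT 1 OFFSET 1),
    p.ActivityID3 = (SELECT a.id
                     FROM Activity a
                     WHERE a.postid = p.id
                     ORDER BY a.`date` DESC, a.id DESC
                     LIMIT 1 OFFSET 2);

-- Inspect the primed slots.
SELECT id, ActivityID1, ActivityID2, ActivityID3 FROM Posts;
```

    Note that the trigger shown above only fires on inserts, so if rows are ever deleted from Activity the stored IDs can go stale; re-running a refresh like this one is one way to repair them.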

  • 2021-01-11 21:17

    I am a little bit lost in your query, but if you want to download data for multiple posts at once, it's not a good idea to include comment data in the first query since you would include all the data about post and posting user multiple times. You should run another query that would connect posts with comments. Something like:

    SELECT 
    B.UUIDPost, 
    C.username,
    C.profileImage, 
    B.Comment,
    B.`DateField`
    FROM Posts A JOIN 
    Activities B ON A.uuid = B.UUIDPost JOIN
    Users C ON B.`UserId` = C.id 
    

    and use that data to display your comments with commenting user id, name, image etc.

    To get only 3 comments per post, you can look into this post:

    Select top 3 values from each group in a table with SQL

    if you are sure that there are going to be no duplicate rows in the comment table or this post:

    How to select top 3 values from each group in a table with SQL which have duplicates

    if you're not sure about that (although due to DateField in the table, it should not be possible).

  • 2021-01-11 21:24

    This will probably get rid of the illegal mix of collations... Just after establishing the connection, perform this query:

    SET NAMES utf8 COLLATE utf8_unicode_ci;
    

    For the question about the 'latest 2', please use the mysql command-line tool to run SHOW CREATE TABLE Posts and provide the output. (Ditto for the other relevant tables.) phpMyAdmin (and other UIs) have a way to run the query without going to a command line.
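
    To see where the two collations in the error come from, it can help to list the collations actually in use. A small sketch: the table names Posts, Activity and USERS are taken from the other answers in this thread and may differ in your schema, and the ALTER at the end is only one possible fix, not necessarily the right one for your data.

```sql
-- Collation of every character column in the tables involved.
SELECT TABLE_NAME, COLUMN_NAME, COLLATION_NAME
FROM information_schema.COLUMNS
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME IN ('Posts', 'Activity', 'USERS')
  AND COLLATION_NAME IS NOT NULL
ORDER BY TABLE_NAME, ORDINAL_POSITION;

-- Collation the current connection applies to string literals
-- and user variables; SET NAMES changes this value.
SHOW VARIABLES LIKE 'collation_connection';

-- One possible permanent fix: convert a table so that all of its
-- character columns share one collation. This rewrites the table,
-- so try it on a copy first.
ALTER TABLE Activity CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
```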

  • 2021-01-11 21:24

    You can get there with a pretty simple query by using sub-queries. First I specify the user in the where-clause and join the posts, because that seems more logical to me. Then I get the number of likes for each post with a sub-query.

    Now, instead of grouping and limiting the group size, we join only the comments we want by limiting the count of comments dated after the one we are currently looking at: a comment is kept only when fewer than 2 comments on the same post are newer.

    Use an INNER JOIN on Activity instead if you only want to show posts with at least one comment.

    SELECT
      u.id,
      u.username,
      u.fullname,
      u.profileImage,
      p.uuid,
      p.caption,
      p.path,
      p.date,
      -- number of likes on this post
      (SELECT COUNT(*) FROM Activity v WHERE v.uuidPost = p.uuid AND v.type = 'like') likes,
      a.commentText,
      a.date commentDate
    FROM
      Users u INNER JOIN
      Posts p ON p.id = u.id LEFT JOIN
      -- keep a comment only if fewer than 2 comments on the same post are newer
      Activity a ON a.uuidPost = p.uuid AND a.type = 'comment' AND 2 > (
        SELECT COUNT(*) FROM Activity v
        WHERE v.uuidPost = p.uuid AND v.type = 'comment' AND v.date > a.date)
    WHERE
      u.id = 145
    


    That said a redesign would probably be best, also performance-wise (Activity will soon contain a lot of entries and they always have to be filtered for the desired type). The user table is okay with the id auto-incremented and as primary key. For the posts I would also add an auto-incremented id as primary key and user_id as foreign key (you can also decide what to do on deletion, e.g. with cascade all his posts would also be deleted automatically).

    For the comments and likes you can create separated tables with the two foreign keys user_id and post_id (simple example, like this you can only like posts and nothing else, but if there are not many different kind of likes it could still be good to create a post_likes and few other ..._likes tables, you have to think about how this data is usually queried, if those likes are mostly independent from each other it's probably a good choice).
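
    A rough sketch of what such a redesign could look like. Every table, column and constraint name here is hypothetical and only illustrates the foreign keys and the cascade-on-delete behaviour described above; it is not the asker's actual schema.

```sql
-- Hypothetical redesigned schema: separate tables for comments and likes.
CREATE TABLE users (
    id       INT UNSIGNED NOT NULL AUTO_INCREMENT,
    username VARCHAR(50)  NOT NULL,
    PRIMARY KEY (id)
) ENGINE=InnoDB;

CREATE TABLE posts (
    id         INT UNSIGNED NOT NULL AUTO_INCREMENT,
    user_id    INT UNSIGNED NOT NULL,
    caption    VARCHAR(255),
    created_at DATETIME     NOT NULL,
    PRIMARY KEY (id),
    -- deleting a user also deletes that user's posts
    CONSTRAINT fk_posts_user FOREIGN KEY (user_id)
        REFERENCES users (id) ON DELETE CASCADE
) ENGINE=InnoDB;

CREATE TABLE comments (
    id           INT UNSIGNED NOT NULL AUTO_INCREMENT,
    post_id      INT UNSIGNED NOT NULL,
    user_id      INT UNSIGNED NOT NULL,
    comment_text VARCHAR(500) NOT NULL,
    created_at   DATETIME     NOT NULL,
    PRIMARY KEY (id),
    -- supports "newest comments of a post" lookups
    KEY idx_comments_post_created (post_id, created_at),
    CONSTRAINT fk_comments_post FOREIGN KEY (post_id)
        REFERENCES posts (id) ON DELETE CASCADE,
    CONSTRAINT fk_comments_user FOREIGN KEY (user_id)
        REFERENCES users (id) ON DELETE CASCADE
) ENGINE=InnoDB;

CREATE TABLE post_likes (
    post_id    INT UNSIGNED NOT NULL,
    user_id    INT UNSIGNED NOT NULL,
    created_at DATETIME     NOT NULL,
    -- one like per user per post
    PRIMARY KEY (post_id, user_id),
    CONSTRAINT fk_likes_post FOREIGN KEY (post_id)
        REFERENCES posts (id) ON DELETE CASCADE,
    CONSTRAINT fk_likes_user FOREIGN KEY (user_id)
        REFERENCES users (id) ON DELETE CASCADE
) ENGINE=InnoDB;
```

    With this layout a like count becomes a plain COUNT(*) on post_likes, and comment queries no longer need to filter a shared table by type.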

  • 2021-01-11 21:31

    UNTESTED: I would recommend putting together an SQL fiddle with some sample data and your existing table structure showing the problem; that way we could play around with the responses and ensure functionality with your schema.

    We use user variables to simulate a window function (such as row_number),

    in this case @row_num and @prev_value. @row_num keeps track of the current row for each post (since a single post could have lots of comments); when a new post ID (UUIDPOST?) is encountered, @row_num is reset to 1. When the current record's UUIDPOST matches the variable @prev_value, we simply increment @row_num by 1.

    This technique allows us to assign a row number based on the date or activity ID in descending order. As each cross join only results in 1 record, we don't cause duplicate records to appear. Then, since we limit by row_number <= 2, we only get the two most recent comments in our newly added left join.

    This assumes the relation of posts to users is many-to-one, meaning a post can only have 1 user.

    Something like this, though I'm not sure about the final left join; I need to better understand the structure of the Activity table, hence my comment against the original question.

    SELECT Posts.id,
            Posts.uuid,
            Posts.caption,
            Posts.path,
            Posts.date,
            USERS.id,
            USERS.username,
            USERS.fullname,
            USERS.profileImage,
            coalesce(A.LikeCNT,0),
            Com.comment
        FROM Posts 
        INNER JOIN USERS 
          ON Posts.id = 145 
         AND USERS.id = 145
        LEFT JOIN (SELECT COUNT(A.uuidPost) LikeCNT, A.UUIDPost
            FROM Activity A
            WHERE type =  'like' 
            GROUP BY A.UUIDPOST) A
         on A.UUIDPost=Posts.uuid
    
    
      -- This join simulates row_number() over (partition by PostID order by activityID desc). Nice article here: http://preilly.me/2011/11/11/mysql-row_number/ (several other examples exist on SO already).
      -- Meaning: generate a row number for each activity from 1-X, restarting at 1 for each new post, but start numbering at the newest activityID.
    
        LEFT JOIN (SELECT comment, UUIDPOST, @row_num := IF(@prev_value=UUIDPOST,@row_num+1,1) as row_number,@prev_value := UUIDPOST
    
                   FROM Activity 
                   CROSS JOIN (SELECT @row_num := 1) x
                   CROSS JOIN (SELECT @prev_value := '') y
                   WHERE type = 'comment'
                   ORDER BY UUIDPOST, `date` DESC -- or some activity ID, descending
                  ) Com
           on Com.UUIDPOST = Posts.uuid
           and Com.row_number <= 2
    
    
      -- Now since we have a row_number restarting at 1 for each new post, simply return only the 1st two rows.
    
        ORDER BY Posts.date DESC
        LIMIT 0, 5
    

    We had to put the row_number <= 2 condition on the join itself. If it were put in the where clause, you would lose the posts without any comments, which I think you still want.

    Additionally, we should probably look at the "comment" field to make sure it's not blank or null, but let's make sure this works first.
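
    On MySQL 8.0 or later (a version requirement that is not stated anywhere in this thread, so treat it as an assumption), the user-variable simulation can be replaced by the real ROW_NUMBER() window function. A minimal self-contained sketch follows; the demo_posts and demo_activity tables are hypothetical stand-ins with just enough columns to run. The alias is rn rather than row_number because ROW_NUMBER is a reserved word in MySQL 8.0.

```sql
-- Hypothetical minimal tables, only for this illustration.
CREATE TABLE demo_posts (
    uuid   VARCHAR(7) PRIMARY KEY,
    `date` DATETIME
);

CREATE TABLE demo_activity (
    id          INT PRIMARY KEY,
    uuidPost    VARCHAR(7),
    `type`      VARCHAR(40),
    commentText VARCHAR(20),
    `date`      DATETIME
);

INSERT INTO demo_posts VALUES
    ('p1', '2016-10-10 00:00:00'),
    ('p2', '2016-10-11 00:00:00');   -- p2 has no comments at all

INSERT INTO demo_activity VALUES
    (1, 'p1', 'comment', 'first',  '2016-07-05 00:00:00'),
    (2, 'p1', 'comment', 'second', '2016-07-06 00:00:00'),
    (3, 'p1', 'comment', 'third',  '2016-07-07 00:00:00'),
    (4, 'p1', 'like',    NULL,     '2016-07-08 00:00:00');

-- Number each post's comments from newest (1) to oldest, then keep
-- only rows 1 and 2. The filter sits in the ON clause so that posts
-- without comments (p2) still appear, with NULL comment columns.
SELECT p.uuid,
       c.commentText,
       c.`date` AS commentDate
FROM demo_posts p
LEFT JOIN (
    SELECT uuidPost,
           commentText,
           `date`,
           ROW_NUMBER() OVER (PARTITION BY uuidPost
                              ORDER BY `date` DESC, id DESC) AS rn
    FROM demo_activity
    WHERE `type` = 'comment'
) c ON c.uuidPost = p.uuid
   AND c.rn <= 2
ORDER BY p.`date` DESC, c.rn;
```

    Unlike the user-variable version, the numbering here does not depend on the order in which the server happens to evaluate the select list.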

  • 2021-01-11 21:38

    This error message

    Illegal mix of collations (utf8_general_ci,IMPLICIT) and (utf8_unicode_ci,IMPLICIT) for operation '='

    is typically due to the definition of your columns and tables. It usually means that on either side of an equal sign there are different collations. What you need to do is choose one and include that decision in your query.
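
    As a small illustration of "choosing one", here is a sketch with two hypothetical demo tables whose columns deliberately use different collations. Naming a collation on one side of the comparison makes that side explicit, so MySQL no longer has to pick between two implicit ones.

```sql
-- Hypothetical demo tables with deliberately mismatched collations.
CREATE TABLE collate_demo_a (
    code VARCHAR(10) CHARACTER SET utf8 COLLATE utf8_general_ci
);
CREATE TABLE collate_demo_b (
    code VARCHAR(10) CHARACTER SET utf8 COLLATE utf8_unicode_ci
);

INSERT INTO collate_demo_a VALUES ('abc');
INSERT INTO collate_demo_b VALUES ('abc');

-- Comparing the two columns directly raises the "Illegal mix of
-- collations" error, because both sides are IMPLICIT:
--   SELECT a.code
--   FROM collate_demo_a a
--   JOIN collate_demo_b b ON a.code = b.code;

-- Stating the collation on one side resolves the conflict.
SELECT a.code
FROM collate_demo_a a
JOIN collate_demo_b b
  ON a.code = b.code COLLATE utf8_unicode_ci;
```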

    The collation issue here was in the CROSS JOIN of @prev_value which needed an explicit collation to be used.

    I have also slightly changed the "row_number" logic to a single cross join and moved the if logic to the extremes of the select list.

    Some sample data is displayed below. Sample data is needed to test queries with. Anyone attempting to answer your question with working examples will need data. The reason I am including it here is twofold.

    1. so that you will understand any result I present
    2. so that in future when you ask another SQL related question you understand the importance of supplying data. It is not only more convenient for us that you do this. If the asker provides the sample data then the asker will already understand it - it won't be an invention of some stranger who has devoted some of their time to help out.

    Sample Data

    Please note some columns are missing from the tables, only the columns specified in the table details have been included.

    This sample data has 5 comments against a single post (no likes are recorded)

    CREATE TABLE Posts 
    (
    `id` int, 
    `uuid` varchar(7) collate utf8_unicode_ci,
    `imageLink` varchar(9) collate utf8_unicode_ci, 
    `date` datetime
     );
        
    INSERT INTO Posts(`id`, `uuid`, `imageLink`, `date`)
    VALUES
    (145, 'abcdefg', 'blah blah', '2016-10-10 00:00:00') ;
    
    CREATE TABLE   USERS
    (
    `id` int, 
    `username` varchar(15) collate utf8_unicode_ci,
     `profileImage` varchar(12) collate utf8_unicode_ci,
     `date` datetime
    ) ;
            
    INSERT INTO     USERS(`id`, `username`, `profileImage`, `date`)
    VALUES
    (145, 'used_by_already', 'blah de blah', '2014-01-03 00:00:00') ;
        
        
    CREATE TABLE Activity
    (
    `id` int, 
    `uuid` varchar(4) collate utf8_unicode_ci, 
    `uuidPost` varchar(7) collate utf8_unicode_ci,
     `type` varchar(40) collate utf8_unicode_ci, 
    `commentText` varchar(11) collate utf8_unicode_ci, `date` datetime
    ) ;
            
    INSERT INTO Activity (`id`, `uuid`, `uuidPost`, `type`, `commentText`, `date`)
     VALUES
    (345, 'a100', 'abcdefg', 'comment', 'lah lha ha', '2016-07-05 00:00:00'),
    (456, 'a101', 'abcdefg', 'comment', 'lah lah lah', '2016-07-06 00:00:00'),
    (567, 'a102', 'abcdefg', 'comment', 'lha lha ha', '2016-07-07 00:00:00'),
    (678, 'a103', 'abcdefg', 'comment', 'ha lah lah', '2016-07-08 00:00:00'),
    (789, 'a104', 'abcdefg', 'comment', 'hla lah lah', '2016-07-09 00:00:00') ;
    

    [SQL Standard behaviour: 2 rows per Post query]

    This was my initial query, with some corrections. I changed the column order of the select list so that you will see some comment-related data easily when I present the results. Please study those results; they are provided so that you may understand what the query will do. Columns preceded by # do not exist in the sample data I am working with, for reasons I have already noted.

    SELECT
          Posts.id
        , Posts.uuid
        , rcom.uuidPost
        , rcom.commentText
        , rcom.`date` commentDate 
        #, Posts.caption
        #, Posts.path
        , Posts.`date`
        , USERS.id
        , USERS.username
        #, USERS.fullname
        , USERS.profileImage
        , COALESCE(A.LikeCNT, 0) num_likes
    FROM Posts
    INNER JOIN USERS ON Posts.id = 145
                AND USERS.id = 145
    LEFT JOIN (
              SELECT
                    COUNT(A.uuidPost) LikeCNT
                  , A.UUIDPost
              FROM Activity A
              WHERE type = 'like'
              GROUP BY
                    A.UUIDPOST
              ) A ON A.UUIDPost = Posts.uuid 
    LEFT JOIN (
          SELECT
                @row_num := IF(@prev_value=UUIDPOST,@row_num+1,1) as row_number
              , commentText
              , uuidPost
              , `date`
              , @prev_value := UUIDPOST
          FROM Activity
          CROSS JOIN ( SELECT @row_num := 1, @prev_value := '' collate utf8_unicode_ci  ) xy
          WHERE type = 'comment'
          ORDER BY
                uuidPost
              , `date` DESC
          ) rcom ON rcom.uuidPost  = Posts.UUID
                AND rcom.row_number <= 2
    ORDER BY
          Posts.`date` DESC
          ;
          
          
    

    See a working demonstration of this query at SQLFiddle

    Results:

    |  id |    uuid | uuidPost | commentText |                   date |                      date |  id |        username | profileImage | num_likes |
    |-----|---------|----------|-------------|------------------------|---------------------------|-----|-----------------|--------------|-----------|
    | 145 | abcdefg |  abcdefg | hla lah lah | July, 09 2016 00:00:00 | October, 10 2016 00:00:00 | 145 | used_by_already | blah de blah |         0 |
    | 145 | abcdefg |  abcdefg |  ha lah lah | July, 08 2016 00:00:00 | October, 10 2016 00:00:00 | 145 | used_by_already | blah de blah |         0 |
    

    There are 2 ROWS, as expected: one row for the most recent comment, and another row for the next most recent comment. This is normal behaviour for SQL, and until a comment was added under this answer, readers of the question would assume this normal behaviour would be acceptable.

    The question lacks a clearly articulated "expected result".


    [Option 1: One row per Post query, with UP TO 2 comments, added columns]

    In a comment below it was revealed that you did not want 2 rows per post and that this would be an easy fix. Well, it kind of is easy, BUT there are options, and the options are dictated by the user in the form of requirements. IF the question had an "expected result" then we would know which option to choose. Nonetheless, here is one option:

    SELECT
          Posts.id
        , Posts.uuid
        , max(case when rcom.row_number = 1 then rcom.commentText end) Comment_one
        , max(case when rcom.row_number = 2 then rcom.commentText end) Comment_two
        #, Posts.caption
        #, Posts.path
        , Posts.`date`
        , USERS.id
        , USERS.username
        #, USERS.fullname
        , USERS.profileImage
        , COALESCE(A.LikeCNT, 0) num_likes
    FROM Posts
    INNER JOIN USERS ON Posts.id = 145
                AND USERS.id = 145
    LEFT JOIN (
              SELECT
                    COUNT(A.uuidPost) LikeCNT
                  , A.UUIDPost
              FROM Activity A
              WHERE type = 'like'
              GROUP BY
                    A.UUIDPOST
              ) A ON A.UUIDPost = Posts.uuid 
    LEFT JOIN (
          SELECT
                @row_num := IF(@prev_value=UUIDPOST,@row_num+1,1) as row_number
              , commentText
              , uuidPost
              , `date`
              , @prev_value := UUIDPOST
          FROM Activity
          CROSS JOIN ( SELECT @row_num := 1, @prev_value := '' collate utf8_unicode_ci  ) xy
          WHERE type = 'comment'
          ORDER BY
                uuidPost
              , `date` DESC
          ) rcom ON rcom.uuidPost  = Posts.UUID
                AND rcom.row_number <= 2
    GROUP BY
          Posts.id
        , Posts.uuid
        #, Posts.caption
        #, Posts.path
        , Posts.`date`
        , USERS.id
        , USERS.username
        #, USERS.fullname
        , USERS.profileImage
        , COALESCE(A.LikeCNT, 0)
    ORDER BY
          Posts.`date` DESC
          ;
    

    See the second query working at SQLFiddle

    Results of query 2:

    |  id |    uuid | Comment_one | Comment_two |                      date |  id |        username | profileImage | num_likes |
    |-----|---------|-------------|-------------|---------------------------|-----|-----------------|--------------|-----------|
    | 145 | abcdefg | hla lah lah |  ha lah lah | October, 10 2016 00:00:00 | 145 | used_by_already | blah de blah |         0 |
    

    [Option 2: One row per Post query, with the most recent comments concatenated into a single comma-separated list]

    SELECT
          Posts.id
        , Posts.uuid
        , group_concat(rcom.commentText) Comments_two_concatenated
        #, Posts.caption
        #, Posts.path
        , Posts.`date`
        , USERS.id
        , USERS.username
        #, USERS.fullname
        , USERS.profileImage
        , COALESCE(A.LikeCNT, 0) num_likes
    FROM Posts
    INNER JOIN USERS ON Posts.id = 145
                AND USERS.id = 145
    LEFT JOIN (
              SELECT
                    COUNT(A.uuidPost) LikeCNT
                  , A.UUIDPost
              FROM Activity A
              WHERE type = 'like'
              GROUP BY
                    A.UUIDPOST
              ) A ON A.UUIDPost = Posts.uuid 
    LEFT JOIN (
          SELECT
                @row_num := IF(@prev_value=UUIDPOST,@row_num+1,1) as row_number
              , commentText
              , uuidPost
              , `date`
              , @prev_value := UUIDPOST
          FROM Activity
          CROSS JOIN ( SELECT @row_num := 1, @prev_value := '' collate utf8_unicode_ci  ) xy
          WHERE type = 'comment'
          ORDER BY
                uuidPost
              , `date` DESC
          ) rcom ON rcom.uuidPost  = Posts.UUID
                AND rcom.row_number <= 2
    GROUP BY
          Posts.id
        , Posts.uuid
        #, Posts.caption
        #, Posts.path
        , Posts.`date`
        , USERS.id
        , USERS.username
        #, USERS.fullname
        , USERS.profileImage
        , COALESCE(A.LikeCNT, 0)
    ORDER BY
          Posts.`date` DESC
          
    

    See this third query working at SQLFiddle

    Results of query 3:

    |  id |    uuid | Comments_two_concatenated |                      date |  id |        username | profileImage | num_likes |
    |-----|---------|---------------------------|---------------------------|-----|-----------------|--------------|-----------|
    | 145 | abcdefg |    hla lah lah,ha lah lah | October, 10 2016 00:00:00 | 145 | used_by_already | blah de blah |         0 |
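
    One detail worth knowing: group_concat does not promise any particular order of the concatenated values unless you ask for one. A small sketch against the Activity sample table defined earlier in this answer, showing the optional ORDER BY and SEPARATOR clauses; the ' | ' separator is just an arbitrary choice for illustration.

```sql
-- Assumes the Activity sample table from this answer.
-- Concatenate each post's comments newest-first with a custom separator.
SELECT uuidPost,
       GROUP_CONCAT(commentText
                    ORDER BY `date` DESC
                    SEPARATOR ' | ') AS comments_newest_first
FROM Activity
WHERE `type` = 'comment'
GROUP BY uuidPost;
```

    The same ORDER BY clause could presumably be placed inside the group_concat call of the query above, ordering by rcom.row_number, to pin down which comment comes first. Also note that the result is truncated at the group_concat_max_len setting (1024 bytes by default).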
    

    Summary

    I have presented 3 queries. Each one shows only the 2 most recent comments, but each does so in a different way. The first query (default behaviour) displays 2 rows for each post. Option 1 adds a column but removes the second row. Option 2 concatenates the 2 most recent comments.

    Please note that:

    • The question lacks table definitions covering all columns
    • The question lacks any sample data, which makes it harder for you to understand any results presented here, but also harder for us to prepare solutions
    • The question also lacks a definitive "expected result" (the wanted output) and this has led to further complexity in answering

    I do hope the additional provided information will be of some use, and that by now you also know that it is normal for SQL to present data as multiple rows. If you do not want that normal behaviour please be specific about what you do really want in your question.


    Postscript. To include yet another subquery for "follows" you may use a similar subquery to the one you already have. It may be added before or after that subquery. You may also see it in use at sqlfiddle here

    LEFT JOIN (
              SELECT
                    COUNT(*) FollowCNT
                  , IdOtherUser
              FROM Activity
              WHERE type = 'Follow'
              GROUP BY
                    IdOtherUser
              ) F ON USERS.id = F.IdOtherUser
    

    Whilst adding another subquery may resolve your desire for more information, the overall query may get slower in proportion to the growth of your data. Once you have settled on the functionality you really need it may be worthwhile considering what indexes you need on those tables. (I believe you would be advised to ask for that advice separately, and if you do make sure you include 1. the full DDL of your tables and 2. an explain plan of the query.)
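
    As a starting point for that investigation, here is a hedged sketch of one index that might help the like-count and comment subqueries, together with an EXPLAIN to check whether it is used. The index name is hypothetical and the column choice is only a guess based on the sample Activity table; measure with your real data before keeping it.

```sql
-- Hypothetical index: filter by type, then group/join by post,
-- then order by date within each post.
CREATE INDEX idx_activity_type_post_date
    ON Activity (`type`, uuidPost, `date`);

-- Check the plan of the like-count subquery; the "key" column of
-- the output shows which index, if any, the optimizer chose.
EXPLAIN
SELECT uuidPost, COUNT(*) AS LikeCNT
FROM Activity
WHERE `type` = 'like'
GROUP BY uuidPost;
```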
