Is there a simpler way to achieve this style of user messaging?

轮回少年 2020-12-24 12:20

I have created a messaging system for users that allows them to send a message to another user. If it is the first time they have spoken, a new conversation is initiated,

13 answers
  • 2020-12-24 12:48

    I think you do not need to create a userconversation table.

    If a user can have only one conversation with any given person, the unique id for the thread is a concatenation of userId and friendId, so I would move the friendId column into the usermessages table. The ordering problem (friendId-userId is the same thread as userId-friendId) can be solved like this:

    SELECT CONCAT(GREATEST(userId,FriendId),"_",LEAST(userId,FriendId)) AS threadId
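
The same canonical id can also be computed in application code, for example when building cache keys. This is only a minimal Python sketch; the function name thread_id is made up for illustration and is not part of the answer:

```python
def thread_id(user_id: int, friend_id: int) -> str:
    """Return the same id for (a, b) and (b, a): larger id first, then the smaller."""
    return f"{max(user_id, friend_id)}_{min(user_id, friend_id)}"


# Both directions of the same pair map to one thread.
print(thread_id(3, 4))
print(thread_id(4, 3))
```

Putting the larger id first mirrors the GREATEST(...)/LEAST(...) order used in the SQL expression, so ids produced in code and in the database should agree.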
    

    Now there is the problem of fetching the last message after a GROUP BY threadId.

    I think a good solution is to concatenate the date and the message and then take the MAX of that field. Because the date comes first and has a fixed-width format, the largest string belongs to the newest message.

    For simplicity I assume the date column is a DATETIME field ('YYYY-mm-dd H:i:s'), but that is not required: a Unix timestamp can be converted with the FROM_UNIXTIME function.

    So the final query is:

    SELECT
            CONCAT(GREATEST(userId,FriendId),"_",LEAST(userId,FriendId)) AS threadId,
            -- the other participant, whichever column they are stored in
            MAX(IF(userId = :userId, friendId, userId)) AS friendId,
            MAX(date) AS last_date,
            MAX(CONCAT(date,"|",message)) AS last_date_and_message

    FROM usermessages
    WHERE userId = :userId OR friendId = :userId
    GROUP BY threadId ORDER BY last_date DESC
    

    The last_date_and_message field then looks like this:

    2012-05-18 00:18:54|Hi my friend this is my last message
    

    It can easily be parsed in your server-side code.
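
As a rough illustration of that parsing step, here is a small Python sketch. The function name parse_last is hypothetical, and it assumes the date part always uses the 'YYYY-mm-dd H:i:s' layout shown above:

```python
from datetime import datetime
from typing import Tuple


def parse_last(value: str) -> Tuple[datetime, str]:
    # Split only on the first "|" so that pipes inside the message survive.
    date_part, message = value.split("|", 1)
    return datetime.strptime(date_part, "%Y-%m-%d %H:%M:%S"), message


last_date, last_message = parse_last(
    "2012-05-18 00:18:54|Hi my friend this is my last message"
)
print(last_date.isoformat(), "-", last_message)
```

Limiting the split to one separator matters because users can type "|" in their messages, while the date part never contains it.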

  • 2020-12-24 12:50

    Extending the answer suggested by Watcher.

    You should consider dropping the "conversation" concept to simplify further.

    +----+---------+------+------------------+--------+----------+
    | id | message | read | time             | toUser | fromUser |
    +----+---------+------+------------------+--------+----------+
    | 1  |  test 1 |  0   | (some timestamp) |  3     |   4      |
    | 2  |  test 2 |  0   | (some timestamp) |  4     |   3      |
    +----+---------+------+------------------+--------+----------+
    

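    List of all conversations for user 123, with the latest message of each:

    SELECT m.id, m.message, m.toUser, m.fromUser
    FROM userMessages AS m
    INNER JOIN (
        -- newest message id per user pair, regardless of direction
        SELECT MAX(id) AS id
        FROM userMessages
        WHERE toUser = 123 OR fromUser = 123
        GROUP BY LEAST(toUser, fromUser), GREATEST(toUser, fromUser)
    ) AS latest ON latest.id = m.id
    ORDER BY m.id DESC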
    

    List entire conversation between user 123 and user 456:

    SELECT * 
    FROM userMessages
    WHERE (toUser = 123 OR fromUser = 123) 
    AND (toUser = 456 OR fromUser = 456)
    ORDER BY time DESC
    
  • 2020-12-24 12:51

    Hmm, maybe I'm not understanding your problem correctly, but to me the solution is quite simple:

    SELECT c.*, MAX(m.time) as latest_post 
    FROM conversations as c 
    INNER JOIN messages as m ON c.id = m.conversation_id
    WHERE c.userId = 222 OR c.friendId = 222 
    GROUP BY c.id
    ORDER BY latest_post DESC
    

    Here's my test data:

    Conversations :

    id  userId  friendId
    1   222     333
    2   222     444
    

    Messages :

    id  message  time (Desc)          conversation_id
    14  rty      2012-05-14 19:59:55  2
    13  cvb      2012-05-14 19:59:51  1
    12  dfg      2012-05-14 19:59:46  2
    11  ert      2012-05-14 19:59:42  1
    1   foo      2012-05-14 19:22:57  2
    2   bar      2012-05-14 19:22:57  2
    3   foo      2012-05-14 19:14:13  1
    8   wer      2012-05-13 19:59:37  2
    9   sdf      2012-05-13 19:59:24  1
    10  xcv      2012-05-11 19:59:32  2
    4   bar      2012-05-10 19:58:06  1
    6   zxc      2012-05-08 19:59:17  2
    5   asd      2012-05-08 19:58:56  1
    7   qwe      2012-05-04 19:59:20  1
    

    Query result :

    id  userId  friendId  latest_post
    2   222     444       2012-05-14 19:59:55
    1   222     333       2012-05-14 19:59:51
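
To try the query without a MySQL server, it can be run against an in-memory SQLite database from Python. This is only a sketch: it loads a subset of the test rows above, and SQLite is a stand-in here, so column types and GROUP BY rules may differ from MySQL:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE conversations (id INTEGER PRIMARY KEY, userId INTEGER, friendId INTEGER);
CREATE TABLE messages (id INTEGER PRIMARY KEY, message TEXT, time TEXT, conversation_id INTEGER);
""")
conn.executemany("INSERT INTO conversations VALUES (?, ?, ?)",
                 [(1, 222, 333), (2, 222, 444)])
# A subset of the test data from the answer.
conn.executemany("INSERT INTO messages VALUES (?, ?, ?, ?)", [
    (14, "rty", "2012-05-14 19:59:55", 2),
    (13, "cvb", "2012-05-14 19:59:51", 1),
    (12, "dfg", "2012-05-14 19:59:46", 2),
    (7, "qwe", "2012-05-04 19:59:20", 1),
])

user_id = 222
rows = conn.execute("""
    SELECT c.*, MAX(m.time) AS latest_post
    FROM conversations AS c
    INNER JOIN messages AS m ON c.id = m.conversation_id
    WHERE c.userId = ? OR c.friendId = ?
    GROUP BY c.id
    ORDER BY latest_post DESC
""", (user_id, user_id)).fetchall()

for row in rows:
    print(row)
```

The rows come back newest conversation first, matching the query result shown above.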
    

    If that's not it... just ignore my answer :P

    Hope this helps

  • 2020-12-24 12:55

    Since a given pair of users can have at most one conversation, there is no need to "invent" separate key just to identify conversations. Also, the wording of your question seems to suggest that a message is always sent to a single user, so I'd probably go with something like this:

    [Image: data model diagram]

    Now, there are several things to note about this model:

    • It assumes messages between the same two users cannot be generated more frequently than the resolution provided by the type used for SEND_TIME. [1]
    • The direction of the message is not determined by order of USER1_ID and USER2_ID, but with a separate flag (DIRECTION). This way, a message between given users will always have the same combination of USER1_ID and USER2_ID (enforced by the CHECK above), regardless of who sent and who received the message. This greatly simplifies querying.
    • It is unfortunate that all InnoDB tables are clustered, so the secondary index I1 is relatively expensive. There are ways to work around that, but the resulting complications are probably not worth it.

    With this data model, it becomes rather easy to sort the "conversations" (identified by user pairs) by the latest message. For example (replace 1 with desired user's USER_ID):

    SELECT *
    FROM (
        SELECT USER1_ID, USER2_ID, MAX(SEND_TIME) NEWEST
        FROM MESSAGE
        WHERE (USER1_ID = 1 OR USER2_ID = 1)
        GROUP BY USER1_ID, USER2_ID
    ) Q
    ORDER BY NEWEST DESC;
    

    (OR USER2_ID = 1 is the reason for the secondary index I1.)

    If you want not just latest times, but also latest messages, you can do something like this:

    SELECT * FROM MESSAGE T1
    WHERE
        (USER1_ID = 1 OR USER2_ID = 1)
        AND SEND_TIME = (
            SELECT MAX(SEND_TIME)
            FROM MESSAGE T2
            WHERE
                T1.USER1_ID = T2.USER1_ID
                AND T1.USER2_ID = T2.USER2_ID
        )
    ORDER BY SEND_TIME DESC;
    

    You can play with it in the SQL Fiddle.
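
Since the diagram is not reproduced here, the following Python/SQLite sketch reconstructs one possible version of the model from the names used in the text (MESSAGE, USER1_ID, USER2_ID, SEND_TIME, DIRECTION, I1). Everything else is an assumption: the BODY column, the DIRECTION values, the exact primary key and CHECK, and the send helper are hypothetical, and SQLite stands in for MySQL:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE MESSAGE (
    USER1_ID  INTEGER NOT NULL,
    USER2_ID  INTEGER NOT NULL,
    SEND_TIME TEXT    NOT NULL,
    DIRECTION INTEGER NOT NULL,  -- assumed encoding: 1 = USER1 sent, 2 = USER2 sent
    BODY      TEXT    NOT NULL,  -- hypothetical column for the message text
    PRIMARY KEY (USER1_ID, USER2_ID, SEND_TIME),
    CHECK (USER1_ID < USER2_ID)  -- one canonical order per user pair
);
CREATE INDEX I1 ON MESSAGE (USER2_ID);
""")


def send(sender, receiver, when, body):
    # Store the pair in canonical order and record who sent it in DIRECTION.
    u1, u2 = min(sender, receiver), max(sender, receiver)
    direction = 1 if sender == u1 else 2
    conn.execute("INSERT INTO MESSAGE VALUES (?, ?, ?, ?, ?)",
                 (u1, u2, when, direction, body))


send(1, 2, "2012-05-14 10:00:00", "hi")
send(2, 1, "2012-05-14 10:05:00", "hello")
send(3, 1, "2012-05-14 09:00:00", "ping")

# Latest message of every conversation that involves user 1.
latest = conn.execute("""
    SELECT USER1_ID, USER2_ID, SEND_TIME, BODY FROM MESSAGE T1
    WHERE (USER1_ID = ? OR USER2_ID = ?)
      AND SEND_TIME = (
          SELECT MAX(SEND_TIME) FROM MESSAGE T2
          WHERE T1.USER1_ID = T2.USER1_ID AND T1.USER2_ID = T2.USER2_ID
      )
    ORDER BY SEND_TIME DESC
""", (1, 1)).fetchall()
print(latest)
```

Because the pair is always stored smaller id first, both directions of a conversation share the same (USER1_ID, USER2_ID) values and the grouping needs no GREATEST/LEAST tricks.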


    [1] If that's not the case, you can use a monotonically incrementing INT instead, but you'll have to SELECT MAX(...) yourself since auto-increment doesn't work on a PK subset; or simply make it the PK alone and have secondary indexes on both USER1_ID and USER2_ID (fortunately, they would be slimmer since the PK is slimmer).

  • 2020-12-24 13:01

    Why are you breaking up the data into conversations?

    If it were me, I would use one table called 'usermessages' with the following format:

    +----+--------+----------+-------------+------------+--------+
    | id | userto | userfrom | timecreated | timeviewed | message|
    +----+--------+----------+-------------+------------+--------+
    

    A conversation is identified by the combination of the 'userto' and 'userfrom' columns. So, when you want to select all of a conversation:

    SELECT * FROM usermessages 
    WHERE (userto = :userto OR userto = :userfrom) 
    AND (userfrom = :userfrom OR userfrom = :userto) 
    ORDER BY timecreated DESC 
    LIMIT 10
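
For experimenting with this single-table layout, here is a small Python/SQLite sketch. The sample rows and the conversation helper are made up, and the WHERE clause is written as two explicit sender/receiver pairs, which is a slightly stricter variant of the condition above (it does not match a message a user sends to themselves):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE usermessages (
        id          INTEGER PRIMARY KEY,
        userto      INTEGER NOT NULL,
        userfrom    INTEGER NOT NULL,
        timecreated TEXT    NOT NULL,
        timeviewed  TEXT,
        message     TEXT    NOT NULL
    )
""")
conn.executemany(
    "INSERT INTO usermessages (userto, userfrom, timecreated, message) VALUES (?, ?, ?, ?)",
    [
        (456, 123, "2012-05-14 19:00:00", "hello"),
        (123, 456, "2012-05-14 19:01:00", "hi there"),
        (123, 789, "2012-05-14 19:02:00", "unrelated"),
    ],
)


def conversation(a, b, limit=10):
    """Newest messages exchanged between users a and b, in either direction."""
    return conn.execute(
        """
        SELECT userfrom, userto, message FROM usermessages
        WHERE (userto = :a AND userfrom = :b) OR (userto = :b AND userfrom = :a)
        ORDER BY timecreated DESC
        LIMIT :limit
        """,
        {"a": a, "b": b, "limit": limit},
    ).fetchall()


rows = conversation(123, 456)
print(rows)
```

The message to user 123 from user 789 is left out, since it belongs to a different pair.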
    
  • 2020-12-24 13:01

    I would set it up like this

    Table details

    conversations (#id, last_message_id)
    
    participation (#uid1, #uid2, conversation_id)
    
    messages (#conversation_id, #id, uid, contents, read, *time)
    

    conversations

    This table will be used mainly to generate a new identifier for each conversation, together with a calculated field of the last update (for optimization). The two users have been disconnected from this table and moved into participation.

    participation

    This table records the conversations between two users in both directions; to explain why, take a look at the following key:

    ALTER TABLE `table` ADD PRIMARY KEY (uid1, uid2);
    

    While this is good for both enforcing the uniqueness and simple lookups, you should be aware of the following behavior:

    • SELECT * FROM table WHERE uid1=1 AND uid2=2
    • SELECT * FROM table WHERE uid1=1
    • SELECT * FROM table WHERE uid1=1 AND uid2>5
    • SELECT * FROM table WHERE uid2=2

    The first two queries perform very well; MySQL can also use the first part of the key on its own for equality lookups. The third one also yields pretty good performance, as the second part of the key can be used for range queries. The last query doesn't perform well at all: the index is "left biased" (usable only from its leftmost column), so a filter on uid2 alone results in a full table scan.
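
The leftmost-column behaviour can be observed with a query planner. The sketch below uses SQLite's EXPLAIN QUERY PLAN from Python purely as a stand-in; MySQL's optimizer and its EXPLAIN output are different, so treat this as an illustration of the principle rather than of MySQL itself. The plan helper is hypothetical:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE participation (
        uid1            INTEGER NOT NULL,
        uid2            INTEGER NOT NULL,
        conversation_id INTEGER NOT NULL,
        PRIMARY KEY (uid1, uid2)
    )
""")


def plan(where):
    rows = conn.execute(
        f"EXPLAIN QUERY PLAN SELECT * FROM participation WHERE {where}"
    ).fetchall()
    return rows[0][3]  # fourth column holds the human-readable plan detail


plan_left = plan("uid1 = 1")   # filter on the leftmost key column
plan_right = plan("uid2 = 2")  # filter on the second key column only

print(plan_left)
print(plan_right)
```

With a filter on uid1 the planner reports an index search, while a filter on uid2 alone falls back to scanning the table.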

    messages

    This table stores the actual sent messages, comprising the conversation identifier, sender id, contents, read flag and the time it was sent.

    Operation

    sending messages

    To determine whether a conversation between two users has already been established you can simply query the participation table:

    SELECT conversation_id FROM participation WHERE uid1=:sender_id AND uid2=:receiver_id
    

    If it does not yet exist, you create both records:

    INSERT INTO conversations (last_message_id) VALUES (NULL);
    # fetch last insert id here
    INSERT INTO participation VALUES (:sender_id, :receiver_id, :conversation_id), (:receiver_id, :sender_id, :conversation_id);
    INSERT INTO messages VALUES (:conversation_id, 0, :sender_id, :message_contents, 0, NOW());
    UPDATE conversations SET last_message_id=LAST_INSERT_ID() WHERE id = :conversation_id
    

    If the conversation is already set up:

    INSERT INTO messages VALUES (:conversation_id, 0, :sender_id, :message_contents, 0, NOW());
    UPDATE conversations SET last_message_id=LAST_INSERT_ID() WHERE id = :conversation_id

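    Note: the UPDATE statement can be scheduled as LOW_PRIORITY because last_message_id doesn't always have to be 100% up to date. (LOW_PRIORITY only affects storage engines that use table-level locking, such as MyISAM.)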
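
Put together, the sending flow might look roughly like the Python/SQLite sketch below. It is untested against MySQL and simplifies the schema: messages gets a single auto-generated id instead of the composite key above, and send_message is a hypothetical helper name:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE conversations (id INTEGER PRIMARY KEY, last_message_id INTEGER);
CREATE TABLE participation (
    uid1 INTEGER NOT NULL, uid2 INTEGER NOT NULL, conversation_id INTEGER NOT NULL,
    PRIMARY KEY (uid1, uid2)
);
CREATE TABLE messages (
    id INTEGER PRIMARY KEY, conversation_id INTEGER NOT NULL, uid INTEGER NOT NULL,
    contents TEXT NOT NULL, read INTEGER NOT NULL DEFAULT 0,
    time TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
""")


def send_message(sender_id, receiver_id, contents):
    with conn:  # one transaction: commit on success, roll back on error
        row = conn.execute(
            "SELECT conversation_id FROM participation WHERE uid1 = ? AND uid2 = ?",
            (sender_id, receiver_id),
        ).fetchone()
        if row is None:
            # First contact: create the conversation and both participation rows.
            conversation_id = conn.execute(
                "INSERT INTO conversations (last_message_id) VALUES (NULL)"
            ).lastrowid
            conn.executemany(
                "INSERT INTO participation VALUES (?, ?, ?)",
                [(sender_id, receiver_id, conversation_id),
                 (receiver_id, sender_id, conversation_id)],
            )
        else:
            conversation_id = row[0]
        message_id = conn.execute(
            "INSERT INTO messages (conversation_id, uid, contents) VALUES (?, ?, ?)",
            (conversation_id, sender_id, contents),
        ).lastrowid
        conn.execute(
            "UPDATE conversations SET last_message_id = ? WHERE id = ?",
            (message_id, conversation_id),
        )
    return conversation_id


first = send_message(1, 2, "hello")
second = send_message(2, 1, "hi back")
print(first, second)
```

The reply from user 2 reuses the conversation created by user 1, because the participation table holds a row for each direction.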

    conversation overview

    This has become a simpler query:

    SELECT other_user.name, m.contents, m.read, c.id
    FROM participation AS p
    INNER JOIN user AS other_user ON other_user.id = p.uid2
    INNER JOIN conversations AS c ON c.id = p.conversation_id
    INNER JOIN messages AS m ON m.id = c.last_message_id
    WHERE p.uid1 = :user_id
    ORDER BY m.time DESC
    LIMIT 50
    

    Disclaimer: I have not tested this, but the write-up should make sense to you.

    Optimization

    Another reason why it's good to have a two-way table is so that it's prepared for sharding, a method in which you push related data into another database (on a different machine); based on certain rules you would determine where to fetch the information from.

    You could move the data in these ways:

    1. divide the participation table up based on the uid1 field
    2. divide the messages table up based on the conversation_id field
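
As a toy example of such a routing rule, the Python sketch below picks a shard by taking the key modulo the number of servers. The modulo rule, SHARD_COUNT and both function names are assumptions for illustration only; real deployments often use lookup tables or consistent hashing instead:

```python
SHARD_COUNT = 4  # hypothetical number of database servers


def participation_shard(uid1: int) -> int:
    """Shard holding every participation row whose uid1 is this user."""
    return uid1 % SHARD_COUNT


def messages_shard(conversation_id: int) -> int:
    """Shard holding all messages of one conversation."""
    return conversation_id % SHARD_COUNT


# User 5's whole conversation overview lives on one shard,
# because the two-way table stores a row with uid1 = 5 for each of their conversations.
print(participation_shard(5), messages_shard(42))
```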

    The messages overview will get more complicated, as you will likely be forced to make two queries; this can be mitigated somewhat with caches (and, in extreme cases, document databases).

    Hope this gives you some ideas on future planning :)
