Select unlocked row in PostgreSQL

你的背包 2020-12-13 06:20

Is there a way to select rows in PostgreSQL that aren't locked? I have a multi-threaded app that will do:

Select... order by id desc limit 1 for update
         


        
14 answers
  •  囚心锁ツ
    2020-12-13 06:41

    No No NOOO :-)

    I know what the author means. I have a similar situation and I came up with a nice solution. First I will describe my situation. I have a table in which I store messages that have to be sent at a specific time. PG doesn't support scheduled execution of functions, so we have to use daemons (or cron). I use a custom-written script that opens several parallel processes. Every process selects a set of messages that have to be sent, with a precision of +1 sec / -1 sec. The table itself is dynamically updated with new messages.

    So every process needs to fetch a set of rows. This set of rows must not be fetched by any other process, because that would make a mess (some people would receive several messages when they should receive only one). That is why we need to lock the rows. The query to fetch a set of messages with the lock:

    FOR messages in select * from public.messages where sendTime >= CURRENT_TIMESTAMP - '1 SECOND'::INTERVAL AND sendTime <= CURRENT_TIMESTAMP + '1 SECOND'::INTERVAL AND sent is FALSE FOR UPDATE LOOP
    -- DO SMTH
    END LOOP;
    

    A process with this query is started every 0.5 sec, so the next query ends up waiting for the first one to release its row locks. This approach creates enormous delays. Even when we use NOWAIT, the query results in an exception, which we don't want because there might be new messages in the table that have to be sent. If we simply use FOR SHARE, the query executes properly but still takes a lot of time, creating huge delays.
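
    To make the blocking behaviour concrete, here is a minimal two-session sketch. The row with msg_id = 1 is a hypothetical example, and each session is assumed to run in its own connection.

    ```sql
    -- Session 1: lock one row and keep the transaction open.
    BEGIN;
    SELECT * FROM public.messages WHERE msg_id = 1 FOR UPDATE;

    -- Session 2, variant A: a plain FOR UPDATE waits here until
    -- session 1 commits or rolls back.
    SELECT * FROM public.messages WHERE msg_id = 1 FOR UPDATE;

    -- Session 2, variant B: NOWAIT does not wait. While session 1 still
    -- holds the lock, it fails at once with the lock_not_available
    -- condition (SQLSTATE 55P03).
    SELECT * FROM public.messages WHERE msg_id = 1 FOR UPDATE NOWAIT;

    -- Session 1: release the lock.
    COMMIT;
    ```

    Variant B is the error that the function further down catches and turns into a boolean.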

    In order to make it work we do a little magic:

    1. changing the query:

      FOR messages in select * from public.messages where sendTime >= CURRENT_TIMESTAMP - '1 SECOND'::INTERVAL AND sendTime <= CURRENT_TIMESTAMP + '1 SECOND'::INTERVAL AND sent is FALSE AND is_locked(msg_id) IS FALSE FOR SHARE LOOP
      -- DO SMTH
      END LOOP;
      
    2. the mysterious function 'is_locked(msg_id)' looks like this:

      CREATE OR REPLACE FUNCTION is_locked(integer) RETURNS BOOLEAN AS $$
      DECLARE
          id integer;
          checkout_id integer;
          is_it boolean;
      BEGIN
          checkout_id := $1;
          is_it := FALSE;
      
          BEGIN
              -- we use FOR UPDATE to attempt a lock and NOWAIT to get the error immediately 
              SELECT msg_id INTO id FROM public.messages WHERE msg_id = checkout_id FOR UPDATE NOWAIT;
              EXCEPTION
                  WHEN lock_not_available THEN
                      is_it := TRUE;
          END;
      
          RETURN is_it;
      
      END;
      $$ LANGUAGE 'plpgsql' VOLATILE COST 100;
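
    A quick way to check the function by hand is a two-session test like the sketch below. The msg_id values 1 and 2 are hypothetical, and both rows are assumed to exist.

    ```sql
    -- Session 1: hold a row lock on message 1.
    BEGIN;
    SELECT msg_id FROM public.messages WHERE msg_id = 1 FOR UPDATE;

    -- Session 2: the NOWAIT attempt inside is_locked fails and is caught,
    -- so this is expected to return true while session 1 is still open.
    SELECT is_locked(1);

    -- Session 2: nobody holds message 2, so this is expected to return false.
    -- Note that the call itself takes the FOR UPDATE lock on that row and
    -- keeps it until the calling transaction ends.
    BEGIN;
    SELECT is_locked(2);
    COMMIT;

    -- Session 1: release the lock.
    COMMIT;
    ```

    That side effect is what keeps the other processes away: a row that passes the check is locked by the process that checked it.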
      

    Of course, this function can be adapted to work on any table in your database. In my opinion it is better to create one check function per table. Adding more logic to this function can only make it slower. It takes longer to check this clause anyway, so there is no need to make it even slower. For me this is the complete solution and it works perfectly.

    Now when I have my 50 processes running in parallel, every process has a unique set of fresh messages to send. Once they are sent, I just update the row with sent = TRUE and never go back to it again.
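
    Put together, one worker run could look roughly like the sketch below. It assumes PostgreSQL 9.0 or later for the DO block, and the RAISE NOTICE line is only a hypothetical stand-in for the real delivery step.

    ```sql
    DO $$
    DECLARE
        message record;
    BEGIN
        FOR message IN
            SELECT *
            FROM public.messages
            WHERE sendTime >= CURRENT_TIMESTAMP - '1 SECOND'::INTERVAL
              AND sendTime <= CURRENT_TIMESTAMP + '1 SECOND'::INTERVAL
              AND sent IS FALSE
              AND is_locked(msg_id) IS FALSE
            FOR SHARE
        LOOP
            -- Hypothetical placeholder: the real sending logic goes here.
            RAISE NOTICE 'sending message %', message.msg_id;

            -- Mark the message as sent so that no later run selects it again.
            UPDATE public.messages
            SET sent = TRUE
            WHERE msg_id = message.msg_id;
        END LOOP;
    END;
    $$;
    ```

    The row locks are released when the block's transaction ends, and by then the rows no longer match sent IS FALSE.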

    I hope this solution will also work for you (author). If you have any questions, just let me know :-)

    Oh, and let me know if this worked for you as well.
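
    A side note for comparison: this is not the approach described above, but PostgreSQL 9.5 and later have a SKIP LOCKED option on the locking clause that skips rows already locked by another transaction. A minimal sketch for the query from the question, assuming that version is available and reusing the messages table from this answer:

    ```sql
    BEGIN;

    -- Take the newest unsent message that no other transaction has locked.
    -- Rows locked elsewhere are skipped instead of waited for.
    SELECT *
    FROM public.messages
    WHERE sent IS FALSE
    ORDER BY msg_id DESC
    LIMIT 1
    FOR UPDATE SKIP LOCKED;

    -- ... process the row, mark it as sent, then release the lock:
    COMMIT;
    ```

    With this option no helper function is needed, at the cost of requiring a newer server version.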
