How do I find the last time that a PostgreSQL database has been updated?

我寻月下人不归 2021-02-01 05:52

I am working with a PostgreSQL database that gets updated in batches. I need to know the last time that the database (or a table in the database) was updated or modified.

5 Answers
  • 2021-02-01 06:19

    I like Jack's approach. You can query the table statistics to get the number of inserts, updates, deletes, and so on:

    select n_tup_upd from pg_stat_user_tables  where relname = 'YOUR_TABLE';
    

    Every updated row increases the count by 1 (n_tup_ins and n_tup_del do the same for inserted and deleted rows).

    Bear in mind that this method is viable when you have a single database instance; multiple instances will probably require a different approach.
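
    The counters only tell you that something changed if you compare them between two points in time. A minimal client-side sketch of that comparison, assuming a DB-API 2.0 cursor (for example from psycopg2, which is an assumption and not something the answer names); the helper names read_counters and has_changed are hypothetical:

```python
# Sketch: detect table changes by comparing pg_stat_user_tables counters.
# Assumes `cursor` is any DB-API 2.0 cursor connected to the database
# (e.g. one from psycopg2); the helper names here are hypothetical.

COUNTER_QUERY = """
    SELECT n_tup_ins, n_tup_upd, n_tup_del
    FROM pg_stat_user_tables
    WHERE relname = %s
"""


def read_counters(cursor, table_name):
    """Return (inserted, updated, deleted) row counters for one table."""
    cursor.execute(COUNTER_QUERY, (table_name,))
    row = cursor.fetchone()
    if row is None:
        raise LookupError(f"no statistics found for table {table_name!r}")
    return tuple(row)


def has_changed(previous, current):
    """True if any counter differs between two snapshots."""
    # Compare for inequality rather than "greater than": the counters
    # can go back to zero if the statistics are reset.
    return previous != current
```

    Note that this tells you whether the table changed between two snapshots, not when. If the same table name exists in several schemas, the query would also need a filter on schemaname.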

  • 2021-02-01 06:28

    See the following article:

    MySQL versus PostgreSQL: Adding a 'Last Modified Time' Column to a Table http://www.pointbeing.net/weblog/2008/03/mysql-versus-postgresql-adding-a-last-modified-column-to-a-table.html

  • 2021-02-01 06:31

    You can write a stored procedure in an "untrusted language" (e.g. plpythonu), which allows access to the files in the postgres "base" directory. Have the stored procedure return the largest mtime of these files.

    But this is only approximate, since vacuum also changes these files and their mtime.
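
    The file-scanning part could look roughly like the sketch below. It is plain Python, so it could serve as the body of such a procedure or run as a standalone script with read access to the data directory. The function name latest_mtime is hypothetical, and the directory to pass in depends on your installation:

```python
import os


def latest_mtime(directory):
    """Return the newest modification time (seconds since the epoch) of any
    regular file under `directory`, or None if there are no files."""
    newest = None
    for root, _dirs, files in os.walk(directory):
        for name in files:
            try:
                mtime = os.stat(os.path.join(root, name)).st_mtime
            except FileNotFoundError:
                # Files can disappear while we scan (e.g. temporary files).
                continue
            if newest is None or mtime > newest:
                newest = mtime
    return newest
```

    The result can be turned into a readable timestamp with datetime.datetime.fromtimestamp(). As noted above, it is an upper bound on the last data change, not an exact value.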

  • 2021-02-01 06:42

    You can write a trigger to run every time an insert/update is made on a particular table. The common usage is to set a "created" or "last_updated" column of the row to the current time, but you could also update the time in a central location if you don't want to change the existing tables.

    So for example a typical way is the following one:

    CREATE FUNCTION stamp_updated() RETURNS TRIGGER LANGUAGE 'plpgsql' AS $$
    BEGIN
      NEW.last_updated := now();
      RETURN NEW;
    END
    $$;
    -- repeat for each table you need to track:
    ALTER TABLE sometable ADD COLUMN last_updated TIMESTAMP;
    CREATE TRIGGER sometable_stamp_updated
      BEFORE INSERT OR UPDATE ON sometable
      FOR EACH ROW EXECUTE PROCEDURE stamp_updated();
    

    Then to find the last update time, you need to select "MAX(last_updated)" from each table you are tracking and take the greatest of those, e.g.:

    SELECT MAX(max_last_updated) FROM (
      SELECT MAX(last_updated) AS max_last_updated FROM sometable
      UNION ALL
      SELECT MAX(last_updated) FROM someothertable
    ) updates
    

    For tables with a serial (or similarly-generated) primary key, you can try to avoid the sequential scan needed to find the latest update time by using the primary key index, or you can create an index on last_updated.

    -- get timestamp of row with highest id
    SELECT last_updated FROM sometable ORDER BY sometable_id DESC LIMIT 1
    

    Note that this can give slightly wrong results in the case of IDs not being quite sequential, but how much accuracy do you need? (Bear in mind that transactions mean that rows can become visible to you in a different order to them being created.)

    An alternative approach to avoid adding 'updated' columns to each table is to have a central table to store update timestamps in. For example:

    -- No primary key on table_name: every modifying statement adds a new row.
    CREATE TABLE update_log(table_name text NOT NULL, updated timestamp NOT NULL DEFAULT now());
    CREATE FUNCTION stamp_update_log() RETURNS TRIGGER LANGUAGE 'plpgsql' AS $$
    BEGIN
      -- "updated" is filled in by its DEFAULT now()
      INSERT INTO update_log(table_name) VALUES(TG_TABLE_NAME);
      -- NEW is not set in statement-level triggers, and the return value is ignored
      RETURN NULL;
    END
    $$;
    -- Repeat for each table you need to track:
    CREATE TRIGGER sometable_stamp_update_log
     AFTER INSERT OR UPDATE ON sometable
     FOR EACH STATEMENT EXECUTE PROCEDURE stamp_update_log();
    

    This will give you a table with one row for each update to a tracked table. You can then just do:

    SELECT MAX(updated) FROM update_log
    

    This gives you the last update time (you could break it out by table if you wanted). The table will of course just keep growing: either create an index on 'updated' (which should make getting the latest one pretty fast), or truncate it periodically if that fits your use case. For example, if you only need to check periodically whether changes have been made, take an exclusive lock on the table, get the latest update time, then truncate it.

    Another approach, which might be what the folks on the forum meant, is to set 'log_statement = mod' in the database configuration (either globally for the cluster, or on the database or user you need to track), so that all statements that modify the database are written to the server log. You'll then need to write something outside the database to scan the server log, filtering out tables you aren't interested in, etc.
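
    A rough sketch of such a log scanner follows. It assumes that log_line_prefix starts with %m (a timestamp with milliseconds) and that statements appear as "LOG:  statement: ..." lines; it only recognises simple INSERT, UPDATE and DELETE statements and ignores statements that continue over several log lines. The function name last_modifications is hypothetical:

```python
import re

# Matches lines such as:
#   2021-02-01 06:42:13.123 UTC [1234] LOG:  statement: UPDATE sometable SET x = 1;
# This assumes log_line_prefix starts with %m (timestamp with milliseconds).
LOG_LINE = re.compile(
    r'^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}(?:\.\d+)?)'
    r'.*?LOG:\s+statement:\s+(?P<sql>.*)$'
)
# Very rough extraction of the target table of a modifying statement.
TARGET = re.compile(
    r'^\s*(?:INSERT\s+INTO|UPDATE(?:\s+ONLY)?|DELETE\s+FROM(?:\s+ONLY)?)'
    r'\s+(?P<table>[\w."]+)',
    re.IGNORECASE,
)


def last_modifications(lines, tables):
    """Map each tracked table name to the timestamp string of the last
    logged INSERT/UPDATE/DELETE that targets it."""
    wanted = {t.lower() for t in tables}
    latest = {}
    for line in lines:
        m = LOG_LINE.match(line)
        if not m:
            continue
        target = TARGET.match(m.group("sql"))
        if not target:
            continue
        table = target.group("table").replace('"', "").lower()
        if table in wanted:
            # The log is chronological, so the last match wins.
            latest[table] = m.group("ts")
    return latest
```

    For example, with open(path) as fh: last_modifications(fh, ["sometable"]) would return a dictionary such as {"sometable": "<timestamp>"}, or an empty dictionary if no matching statement was logged. A logged statement is not proof that rows changed (the transaction may have rolled back), so treat the result as an approximation.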

  • 2021-02-01 06:42

    It looks like you can use pg_stat_database to get a transaction count and check whether it changes from one backup run to the next. See this dba.se answer and its comments for more details.
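
    A sketch of that check, with the previous values kept in a small state file between runs. Note one deliberate difference from the answer: it compares the row counters tup_inserted, tup_updated and tup_deleted instead of the transaction count xact_commit, because read-only transactions (including the monitoring query itself) also commit. The cursor is assumed to be a DB-API 2.0 cursor; the function name and the state file are hypothetical:

```python
import json
import os

# Row-level write counters for the current database; unlike xact_commit,
# these are not incremented by read-only transactions.
DB_COUNTER_QUERY = """
    SELECT tup_inserted, tup_updated, tup_deleted
    FROM pg_stat_database
    WHERE datname = current_database()
"""


def changed_since_last_run(cursor, state_path):
    """Compare the current counters with those saved by the previous run,
    save the new values, and report whether anything changed."""
    cursor.execute(DB_COUNTER_QUERY)
    current = list(cursor.fetchone())
    previous = None
    if os.path.exists(state_path):
        with open(state_path) as fh:
            previous = json.load(fh)
    with open(state_path, "w") as fh:
        json.dump(current, fh)
    # With no saved state (first run) we conservatively report a change.
    return previous != current
```

    A backup job could call this first and skip the dump when it returns False.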
