PostgreSQL: improving pg_dump, pg_restore performance

眼角桃花 2020-12-22 16:15

When I began, I used pg_dump with the default plain format. I was unenlightened.

Research revealed to me time and file size improvements with pg_

6 Answers
  • 2020-12-22 16:18
    zcat dumpfile.gz | pg_restore -d db_name
    

    Removes the full write of the uncompressed data to disk, which is currently your bottleneck.
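
    A minimal sketch of the same idea wrapped in a shell function, assuming dumpfile.gz holds a gzip-compressed custom-format (-Fc) archive. The function name restore_streamed is hypothetical. A plain-format SQL dump would be piped into psql instead, since pg_restore only reads archive formats.

```shell
# Hypothetical helper: stream a gzip-compressed archive into pg_restore
# without first writing the uncompressed data to disk.
restore_streamed() {
    local dumpfile="$1" db_name="$2"
    # gzip -dc behaves like zcat; pg_restore reads the archive from
    # standard input when no input file is named.
    gzip -dc "$dumpfile" | pg_restore -d "$db_name"
}
```

    Called as restore_streamed dumpfile.gz db_name. Reading from a pipe rules out pg_restore's -j option, which needs a regular file or directory as input.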

  • 2020-12-22 16:19

    First, check that you are getting reasonable I/O performance from your disk setup. Then check that your PostgreSQL installation is appropriately tuned. In particular, shared_buffers should be set correctly, and during the restore: maintenance_work_mem should be increased; full_page_writes should be off; wal_buffers should be increased to 16MB; checkpoint_segments should be increased to something like 16 (on 9.5 and later, where checkpoint_segments no longer exists, raise max_wal_size instead); no unreasonable logging should be enabled (such as logging every statement executed); and autovacuum should be disabled.

    If you are on 8.4 or later, also experiment with parallel restore, the --jobs option for pg_restore.

  • 2020-12-22 16:21

    Improving pg_dump and pg_restore

    pg_dump: always use the directory format (-Fd) and the -j option

    time pg_dump -j 8 -Fd -f /tmp/newout.dir fsdcm_external
    

    pg_restore: always tune postgresql.conf for the restore, and use the directory format and the -j option

    work_mem = 32MB
    shared_buffers = 4GB
    maintenance_work_mem = 2GB
    full_page_writes = off
    autovacuum = off
    wal_buffers = -1
    
    time pg_restore -j 8 --format=d -C -d postgres /tmp/newout.dir/
    
  • 2020-12-22 16:31

    Two issues/ideas:

    1. By specifying -Fc, the pg_dump output is already compressed. The compression is not maximal, so you may find some space savings by using "gzip -9", but I would wager it's not enough to warrant the extra time (and I/O) used compressing and uncompressing the -Fc version of the backup.

    2. If you are using PostgreSQL 8.4 or later, you can potentially speed up the restore from a -Fc backup with the pg_restore command-line option "-j n", where n is the number of parallel connections to use for the restore. This allows pg_restore to load more than one table's data or build more than one index at the same time.
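
    The two points above can be sketched as a pair of shell functions. The function names, the default of 4 jobs, and the file and database names are assumptions for illustration, not part of the original answer.

```shell
# Hypothetical helpers; database and file names are placeholders.

# Write a custom-format archive. -Fc output is compressed by default,
# so no separate gzip pass is applied.
dump_custom() {
    local db_name="$1" archive="$2"
    pg_dump -Fc -f "$archive" "$db_name"
}

# Restore the archive with N parallel jobs (default 4).
# -j needs a regular file as input, so the archive cannot arrive on stdin.
restore_parallel() {
    local archive="$1" db_name="$2" jobs="${3:-4}"
    pg_restore -j "$jobs" -d "$db_name" "$archive"
}
```

    For example: dump_custom db_name /tmp/db.dump, followed by restore_parallel /tmp/db.dump db_name 8. A reasonable starting value for the job count is the number of CPU cores on the restoring server, adjusted after measuring.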

  • 2020-12-22 16:37

    As you may have guessed simply by the fact that compressing the backup results in faster performance, your backup is I/O bound. This should come as no surprise as backup is pretty much always going to be I/O bound. Compressing the data trades I/O load for CPU load, and since most CPUs are idle during monster data transfers, compression comes out as a net win.

    So, to speed up backup/restore times, you need faster I/O. Beyond reorganizing the database to not be one huge single instance, that's pretty much all you can do.
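
    One way to check where the I/O versus CPU trade-off lands on a given machine is to dump at several compression levels and compare wall-clock time against archive size. The sketch below assumes the custom format and its -Z option; the function name dump_at_level and the output paths are hypothetical.

```shell
# Hypothetical helper: dump at one compression level and print the
# resulting archive size in bytes.
dump_at_level() {
    local db_name="$1" level="$2" out="$3"
    # -Z sets the compression level (0 = none, 9 = maximum).
    pg_dump -Fc -Z "$level" -f "$out" "$db_name" || return 1
    wc -c < "$out"
}
```

    Running something like for z in 0 1 6 9; do time dump_at_level db_name "$z" "/tmp/db.z$z.dump"; done shows whether higher levels still pay off. If elapsed time rises while the size barely shrinks, the CPU has become the limit rather than the disk.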

  • 2020-12-22 16:40

    I assume you need a backup, not a major upgrade of the database.

    For backups of large databases, you should set up continuous archiving instead of using pg_dump.

    1. Set up WAL archiving.

    2. Make your base backups, for example every day, by using

      psql template1 -c "select pg_start_backup('`date +%F-%T`')"
      rsync -a --delete /var/lib/pgsql/data/ /var/backups/pgsql/base/
      psql template1 -c "select pg_stop_backup()"

    A restore is then as simple as copying back the base backup and the WAL files no older than the pg_start_backup time from the backup location, and starting Postgres. It will also be much faster.
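
    Step 1 is not spelled out above. As a rough sketch, WAL archiving comes down to a few postgresql.conf settings, which the hypothetical helper below appends to a configuration file. The archive directory is an assumed path, and the archive_command follows the common copy-if-absent pattern. The value wal_level = replica applies to 9.6 and later; older releases use archive or hot_standby.

```shell
# Hypothetical helper: append WAL-archiving settings to a postgresql.conf.
# Changing archive_mode requires a server restart to take effect.
enable_wal_archiving() {
    local conf_file="$1" archive_dir="$2"
    # %p is the path of the WAL file to archive, %f is its file name;
    # the test guard refuses to overwrite an already archived file.
    cat >> "$conf_file" <<EOF
wal_level = replica
archive_mode = on
archive_command = 'test ! -f $archive_dir/%f && cp %p $archive_dir/%f'
EOF
}
```

    For example: enable_wal_archiving /var/lib/pgsql/data/postgresql.conf /var/backups/pgsql/wal, where the second path is a placeholder that must exist and be writable by the postgres user.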
