Postgresql PITR backup: best practices to handle multiple databases?

问题

Hy guys, i have a postgresql 8.3 server with many database.

Actually, im planning to backup those db with a script that will store all the backup in a folder with the same name of the db, for example:

/mypath/backup/my_database1/
/mypath/backup/my_database2/
/mypath/backup/foo_database/

Every day i make 1 dump each 2 hours, overwriting the files every day... for example, in the my_database1 folder i have:

my_database1.backup-00.sql  //backup made everyday at the 00.00 AM
my_database1.backup-02.sql  //backup made everyday at the 02.00 AM
my_database1.backup-04.sql  //backup made everyday at the 04.00 AM
my_database1.backup-06.sql  //backup made everyday at the 06.00 AM
my_database1.backup-08.sql  //backup made everyday at the 08.00 AM
my_database1.backup-10.sql  //backup made everyday at the 10.00 AM
[...and so on...]

This is how i actually assure myself to be able to restore everydatabase loosing at least 2 hours of data.

2 hours still looks too much.

I've got a look to the postgresql pitr trought the WAL files, but, those files seem to contain all the data about all my database.

I'll need to separate those files, in the same way i do separate the dump files.

How to?

Otherwise, there is another easy-to-install to have a backup procedure that allo me to restore just 1 backup at 10 seconds earlier, but without creating a dump file every 10 seconds?

回答1:

Why do you want to separate the databases?

The way the PITR works, it is not possible to do since it works on the complete cluster. What you can do in that case is to create a data directory and a separate cluster for each of those databases (not recommended though since it will require different ports, and postmaster instances).

I believe that the benefits of using PITR instead of regular dumps outweigh having separate backups for each database, so perhaps you can re-think the reasons for why you need to separate it.

Another way could be to set up some replication with Slony-I but that would require a separate machine (or instance) that receives the data. On the other hand, that way you would have a replicated system in near real-time.

Update for comment:

To recover from mistakes, like deleting a table, PITR would be perfect since you can replay to a specific time. However, for 500 databases I understand that can be a lot of overhead. Slony-I would probably not work, since it is replicating. Not sure how it handles table deletions.

I am not aware of any other ways you can go. What I would do would probably still be going for PITR and just not do any mistakes ;). Jokes aside, depending how frequently mistakes are being made this could be a solution:

Set it up for PITR
have a second instance ready on standby.
When a mistake happens, replay the restore to the point in time on the second instance.
Do a pg_dump of the affected database from that instance.
Do a pg_restore on the production instance for that database.

However, it would require you to have a second instance ready, either on the same server or a different one (different is recommended). Also, the restore time would be a bit longer since it would require you to do one extra dump and restore.

回答2:

It is not possible with one instance of PostgresSQL.

You can divide your 500 tables between several instances, each listening on different port, but it would mean that they will not use resources like memory effectively (memory reserved but unused in one instance can not be used by another).

Slony will also not work here, as it does not replicate DDL statements, like dropping a table.

I'd recommend doing both:

continue to do your pg_dump backups, but try to smooth it - throttle pg_dump io bandwith, so it will not cripple a server, and run it continuously - when it finishes with the last database then immediately start with a first one;
additionally setup PITR.

This way you can restore a single database fast, but you can loose some data. If you'll decide that you cannot afford to loose that much data then you can restore your PITR backup to a temporary location (with fsync=off and pg_xlog symlinked to ramdisk for speed), pg_dump affected database from there and restore it to your main database.

回答3:

I think the way you are doing this is flawed. You should have one database with multiple schemas and roles. Then you can use PITR. However PITR is not a replacement for dumps.

来源：https://stackoverflow.com/questions/1130555/postgresql-pitr-backup-best-practices-to-handle-multiple-databases

标签

postgresql

multiple-databases

backup-strategies