Increase PostgreSQL write speed at the cost of likely data loss?

执笔经年 2021-01-29 19:08

I love that PostgreSQL is crash resistant, as I don't want to spend time fixing a database. However, I'm sure there must be some things I can disable/modify so that i

8 answers
  • 2021-01-29 19:38

    22 minutes for 1 million rows doesn't seem that slow, particularly if you have lots of indexes.

    How are you doing the inserts? I take it you're using batch inserts, not one-row-per-transaction.

    Does PG support some kind of bulk loading, like reading from a text file or supplying a stream of CSV data to it? If so, you'd probably be best advised to use that.

    Please post the code you're using to load the 1M records, and people will advise.

    Please post:

    • CREATE TABLE statement for the table you're loading into
    • Code you are using to load in
    • A small example of the data (if possible)

    EDIT: It seems the OP isn't interested in bulk-inserts, but is doing a performance test for many single-row inserts. I will assume that each insert is in its own transaction.

    • Consider batching the inserts on the client side, per node: write them to a temporary file (ideally durably and robustly) and have a daemon or periodic process asynchronously insert the outstanding records in reasonably sized batches.
    • In my experience, this per-device batching gives the best performance in audit-style data-warehouse applications, where the data don't need to reach the database immediately. It also makes the application resilient to the database being unavailable.
    • Of course, you will normally have several endpoint devices creating audit records (for example telephone switches, mail relays, web application servers); each must have its own, fully independent instance of this mechanism.
    • This is a really "clever" optimisation which introduces a lot of complexity into the app design and has a lot of places where bugs could happen. Do not implement it unless you are really sure you need it.
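
    The spool-and-flush idea in the bullets above could be sketched roughly as follows. This is only an illustration under stated assumptions, not a production design: the SpoolFile class, the JSON-lines file format, and the insert_batch callback are all hypothetical names, and insert_batch is assumed to insert its whole list in one transaction and raise on failure.

```python
import json
import os


class SpoolFile:
    """Append-only local spool; one independent instance per device or node."""

    def __init__(self, path):
        self.path = path

    def append(self, record):
        # One JSON document per line; fsync so the record survives a crash.
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(json.dumps(record) + "\n")
            f.flush()
            os.fsync(f.fileno())

    def flush(self, insert_batch, batch_size=1000):
        """Hand spooled records to insert_batch in chunks, then delete the spool.

        insert_batch (hypothetical) receives a list of records and is assumed
        to insert them in a single transaction, raising on failure.
        Returns the number of records handed over.
        """
        if not os.path.exists(self.path):
            return 0
        with open(self.path, encoding="utf-8") as f:
            records = [json.loads(line) for line in f if line.strip()]
        for start in range(0, len(records), batch_size):
            insert_batch(records[start:start + batch_size])
        # Only reached if every batch succeeded; otherwise the file is kept.
        os.remove(self.path)
        return len(records)
```

    A periodic job would call spool.flush(insert_batch) while the application only ever calls spool.append(record). Note what the sketch leaves out: if a later batch fails, earlier batches have already been inserted and a retry would send them again, and records appended while a flush is running can be lost when the file is deleted. Handling those cases is exactly the kind of complexity warned about above.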
  • 2021-01-29 19:44

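    You should also increase checkpoint_segments (e.g. to 32 or even higher), and most probably wal_buffers as well. (On PostgreSQL 9.5 and later, checkpoint_segments no longer exists; max_wal_size is the setting to raise instead.)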

    Edit:
    If this is a bulk load, you should use COPY to insert the rows. It is much faster than plain INSERTs.
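
    As a rough sketch of what a COPY-based load can look like from Python: the code below assumes the psycopg2 driver, whose cursors provide copy_expert, and a hypothetical table audit_log with columns device and message. The helper names rows_to_csv_buffer and copy_rows are made up for illustration.

```python
import csv
import io


def rows_to_csv_buffer(rows):
    """Serialize rows into an in-memory CSV buffer that COPY can read."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    for row in rows:
        writer.writerow(row)
    buf.seek(0)
    return buf


def copy_rows(cursor, table, columns, rows):
    """Bulk-load rows with a single COPY ... FROM STDIN statement.

    `cursor` is assumed to be a psycopg2 cursor (it provides copy_expert).
    `table` and `columns` must be trusted identifiers, never user input,
    because they are interpolated into the SQL text unquoted.
    """
    sql = "COPY {} ({}) FROM STDIN WITH (FORMAT csv)".format(
        table, ", ".join(columns)
    )
    cursor.copy_expert(sql, rows_to_csv_buffer(rows))
```

    With a real connection this would be called as copy_rows(conn.cursor(), "audit_log", ["device", "message"], rows), followed by conn.commit(). One caveat of these defaults: csv.writer emits both None and the empty string as an empty field, which COPY's csv format reads as NULL.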

    If you need to use INSERT, have you considered using batching (for JDBC) or multi-row inserts?
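
    To illustrate the multi-row variant, here is a small sketch that builds one INSERT statement carrying many rows. It assumes a DB-API driver that uses %s placeholders (psycopg2 does); the function name build_multirow_insert and the table audit_log are hypothetical.

```python
def build_multirow_insert(table, columns, rows):
    """Return (sql, params) for one INSERT that carries every row in `rows`.

    `table` and `columns` must be trusted identifiers; the row values are
    passed separately as parameters and are never formatted into the SQL.
    """
    if not rows:
        raise ValueError("rows must not be empty")
    # One "(%s, %s, ...)" group per row, one placeholder per column.
    row_placeholder = "(" + ", ".join(["%s"] * len(columns)) + ")"
    sql = "INSERT INTO {} ({}) VALUES {}".format(
        table,
        ", ".join(columns),
        ", ".join([row_placeholder] * len(rows)),
    )
    # Flatten the rows so the values line up with the placeholders in order.
    params = [value for row in rows for value in row]
    return sql, params
```

    The result would be passed to cursor.execute(sql, params) inside a single transaction, so that many rows share one round trip and one commit instead of paying for each row separately.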
