I have a script that generates tens of thousands of inserts into a Postgres DB through a custom ORM. As you can imagine, it's quite slow. This is used for development purposes.
The fastest way to insert data would be the COPY command. But that requires a flat file as its input, and I'm guessing that generating one is not an option for you.
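For reference, in case writing a flat file does turn out to be workable, a COPY from a file would look something like this (table, columns and file path are placeholders):

COPY my_table (col1, col2) FROM '/tmp/my_table.csv' CSV;

Note that COPY ... FROM 'file' reads the file on the server and typically needs superuser rights; psql's \copy variant reads the file on the client instead.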
Don't commit too often, especially do not run this with autocommit enabled. "Tens of thousands" sounds like a single commit at the end would be just right.
If you can convince your ORM to make use of Postgres' multi-row insert, that would speed things up as well.
This is an example of a multi-row insert:
insert into my_table (col1, col2)
values
  (row_1_col_value_1, row_1_col_value_2),
  (row_2_col_value_1, row_2_col_value_2),
  (row_3_col_value_1, row_3_col_value_2);
If you can't generate the above syntax and you are using Java, make sure you are using batched statements instead of single-statement inserts (other database layers may allow something similar).
Edit:
jmz's post inspired me to add something:
You might also see an improvement when you increase wal_buffers to some bigger value (e.g. 8MB) and checkpoint_segments (e.g. 16).
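In postgresql.conf that would be something like the following (illustrative values only; wal_buffers takes effect only after a server restart, and checkpoint_segments was replaced by max_wal_size in PostgreSQL 9.5 and later):

wal_buffers = 8MB
checkpoint_segments = 16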
One thing you can do is remove all indexes, do your inserts, and then recreate the indexes.
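A minimal sketch of that pattern, with placeholder index and table names:

-- drop the index before the bulk load
DROP INDEX my_table_col1_idx;
-- ... run all of the inserts ...
-- then rebuild it in one pass
CREATE INDEX my_table_col1_idx ON my_table (col1);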
Are you sending one batch of tens of thousands of INSERTs, or tens of thousands of individual INSERTs?
I know that with Hibernate you can batch all your SQL statements up and send them at the end in one big chunk, instead of paying the network and database overhead of issuing thousands of SQL statements individually.
If you don't need that kind of functionality in a production environment, I'd suggest you turn fsync off in your PostgreSQL config. This will speed up the inserts dramatically.
Never turn off fsync on a production database.
For inserts that number in the hundreds to thousands, batch them:
begin;
insert1 ...
insert2 ...
...
insert10k ...
commit;
For inserts in the millions, use COPY:
COPY test (ts) FROM stdin;
2010-11-29 22:32:01.383741-07
2010-11-29 22:32:01.737722-07
... 1Million rows
\.
Make sure any column used as a foreign key in another table is indexed in that table, unless that table is trivially small.
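For example (hypothetical parent and child tables), the referencing column gets its own index:

-- child.parent_id is a foreign key to parent(id); index the referencing column
CREATE INDEX child_parent_id_idx ON child (parent_id);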
If you are just initializing constant test data, you could also put the test data into one or more staging tables and then copy the table contents over using

INSERT INTO ... SELECT ...

That should be about as fast as using COPY (though I did not benchmark it), with the advantage that you can do everything with plain SQL commands, without the hassle of setting up an external file as COPY requires.
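A minimal sketch of that approach, with placeholder table and column names:

-- one-time setup: a staging table with the same structure, populated however you like
CREATE TABLE my_table_staging (LIKE my_table);
-- ... fill my_table_staging once ...

-- whenever the test data is needed, copy it over in a single statement
INSERT INTO my_table (col1, col2)
SELECT col1, col2 FROM my_table_staging;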