HyperSQL (HSQLDB): massive insert performance

既然无缘 2021-02-11 02:48

I have an application that has to insert about 13 million rows of about 10 average length strings into an embedded HSQLDB. I've been tweaking things (batch size, single threade

3 answers

不思量自难忘° (OP)

2021-02-11 03:05
With CACHED tables, disk IO takes most of the time. There is no need for multiple threads because you are inserting into the same table. One thing that noticeably improves performance is reusing a single parameterized PreparedStatement and setting its parameters for each row insert.
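
A minimal sketch of that statement reuse, assuming a hypothetical table `words (id, val)` and a hypothetical `commitEvery` interval (neither is from the question). The statement is prepared once and only its parameters change per row:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

final class BulkInsert {
    // Hypothetical table and column names; replace with the real schema.
    static final String INSERT_SQL = "INSERT INTO words (id, val) VALUES (?, ?)";

    /**
     * Inserts every value through one reused PreparedStatement, on a single thread.
     * commitEvery is an assumed tuning knob (rows per commit), not a recommended value.
     * Returns the number of rows inserted.
     */
    static int insertAll(Connection conn, List<String> values, int commitEvery)
            throws SQLException {
        conn.setAutoCommit(false);
        int inserted = 0;
        // Prepared once, outside the loop; never re-prepared per row.
        try (PreparedStatement ps = conn.prepareStatement(INSERT_SQL)) {
            for (String value : values) {
                ps.setInt(1, inserted);   // only the parameters change per row
                ps.setString(2, value);
                ps.executeUpdate();
                inserted++;
                if (inserted % commitEvery == 0) {
                    conn.commit();
                }
            }
        }
        conn.commit(); // commit any remaining rows
        return inserted;
    }
}
```

With an embedded database the connection would typically come from something like `DriverManager.getConnection("jdbc:hsqldb:file:...", "SA", "")`, with the path left to your setup.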

On your machine, you can improve IO significantly by using a large NIO limit for memory-mapped IO. For example, SET FILES NIO SIZE 8192. A 64-bit JVM is required for larger sizes to have an effect.

http://hsqldb.org/doc/2.0/guide/management-chapt.html
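
As a rough sketch, the NIO setting can be issued over the same JDBC connection before the load starts. The class and method names here are hypothetical helpers; only the SQL text comes from the answer:

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;

final class NioConfig {
    /** Builds the SET FILES NIO SIZE statement for the given limit. */
    static String nioSizeSql(int size) {
        if (size <= 0) {
            throw new IllegalArgumentException("NIO size must be positive: " + size);
        }
        return "SET FILES NIO SIZE " + size;
    }

    /** Applies the NIO limit on an open HSQLDB connection. */
    static void apply(Connection conn, int size) throws SQLException {
        try (Statement st = conn.createStatement()) {
            st.execute(nioSizeSql(size));
        }
    }
}
```

Calling `NioConfig.apply(conn, 8192)` sends the exact statement quoted above. Check the management chapter for which values are accepted by your HSQLDB version.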

To reduce IO for the duration of the bulk insert, use SET FILES LOG FALSE and do not perform a checkpoint until the end of the insert. The details are discussed here:

http://hsqldb.org/doc/2.0/guide/deployment-chapt.html#dec_bulk_operations
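
One way to wrap a load in those settings is sketched below. The `Load` callback and class name are hypothetical, and turning logging back on afterwards is my assumption about what you would want, so treat this as an outline and not the documented procedure:

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;

final class BulkLoadMode {
    /** Hypothetical callback that performs the actual inserts. */
    interface Load {
        void run(Connection conn) throws SQLException;
    }

    /**
     * Disables the transaction log, runs the load, then checkpoints once at the end.
     * Logging is re-enabled in a finally block (an assumption, see the lead-in).
     */
    static void withLoggingDisabled(Connection conn, Load load) throws SQLException {
        try (Statement st = conn.createStatement()) {
            st.execute("SET FILES LOG FALSE");
            try {
                load.run(conn);
                st.execute("CHECKPOINT"); // single checkpoint after all rows are in
            } finally {
                st.execute("SET FILES LOG TRUE");
            }
        }
    }
}
```

Keep in mind that with logging off, a crash during the load can lose the inserted rows, so this only suits loads that can be rerun from the source data.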

UPDATE: An insert test with 16 million rows (timings below) produced a 1.9 GB .data file. The inserts took about 6.4 minutes on an average 2-core processor with a 7200 RPM disk. The key is a large NIO allocation.
```
connection time -- 47
complete setup time -- 78 ms
insert time for 16384000 rows -- 384610 ms -- 42598 tps
shutdown time  -- 38109 
```