Does the maximum connection pool size also limit the maximum number of connections to the database?

囚心锁ツ 2021-02-04 22:31

I am using HikariCP with a Spring Boot app that has more than 1000 concurrent users. I have set the maximum pool size:

spring.datasource.hikari.maximum-pool-size=300

1 answer
  •  日久生厌
    2021-02-04 22:40

    Yes, it's intended. Quoting the documentation:

    This property controls the maximum size that the pool is allowed to reach, including both idle and in-use connections. Basically this value will determine the maximum number of actual connections to the database backend. A reasonable value for this is best determined by your execution environment. When the pool reaches this size, and no idle connections are available, calls to getConnection() will block for up to connectionTimeout milliseconds before timing out. Please read about pool sizing. Default: 10

    So when all 300 connections are in use and you request a 301st, Hikari won't create a new one, because maximumPoolSize is the absolute maximum. Instead, the request waits (30 seconds by default) until a connection becomes available again.
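
    The blocking behaviour described above can be imitated with a plain semaphore. This is only a toy sketch, not HikariCP code: the class BoundedPoolSketch and its methods borrow() and giveBack() are hypothetical names, and it returns false on timeout, whereas HikariCP reports the timeout as an exception from getConnection().

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

// Toy stand-in for a fixed-size connection pool (hypothetical, NOT HikariCP code).
class BoundedPoolSketch {
    private final Semaphore permits;      // one permit per pooled "connection"
    private final long timeoutMillis;     // plays the role of connectionTimeout

    BoundedPoolSketch(int maximumPoolSize, long connectionTimeoutMillis) {
        this.permits = new Semaphore(maximumPoolSize, true);
        this.timeoutMillis = connectionTimeoutMillis;
    }

    // Returns true if a "connection" was obtained, false if the wait timed out.
    boolean borrow() {
        try {
            return permits.tryAcquire(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            return false;
        }
    }

    // Hands the "connection" back so a waiting caller can proceed.
    void giveBack() {
        permits.release();
    }

    public static void main(String[] args) {
        BoundedPoolSketch pool = new BoundedPoolSketch(2, 100);
        System.out.println(pool.borrow()); // true: first connection
        System.out.println(pool.borrow()); // true: second connection, pool is now full
        System.out.println(pool.borrow()); // false: waited 100 ms, nothing was returned
        pool.giveBack();
        System.out.println(pool.borrow()); // true: a connection is free again
    }
}
```

    No extra connection is ever created once the limit is reached; callers only succeed when somebody else gives a connection back in time.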

    This also explains why you get the exception you mentioned: when maximumPoolSize is not configured, the default is 10 connections, a limit you will probably reach immediately.

    To solve this issue, you have to find out why these connections are blocked for more than 30 seconds. Even in a situation with 1000 concurrent users, there should be no problem if your query takes a few milliseconds or a few seconds at most.

    Increasing the pool size

    If you are running really complex queries that take a long time, there are a few possibilities. The first one is to increase the pool size. This is not advisable, however, because the recommended formula for calculating the maximum pool size is:

    connections = ((core_count * 2) + effective_spindle_count)
    

    Quoting the About Pool Sizing article:

    A formula which has held up pretty well across a lot of benchmarks for years is that for optimal throughput the number of active connections should be somewhere near ((core_count * 2) + effective_spindle_count). Core count should not include HT threads, even if hyperthreading is enabled. Effective spindle count is zero if the active data set is fully cached, and approaches the actual number of spindles as the cache hit rate falls. ... There hasn't been any analysis so far regarding how well the formula works with SSDs.

    As described in the same article, that means that a 4-core server with 1 hard disk should only have about 10 connections ((4 * 2) + 1 = 9). Even though you might have more cores, I'm assuming that you don't have enough cores to warrant the 300 connections you're making, let alone increasing it even further.
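
    As a quick check, the quoted formula can be written as a small helper. The class and method names here are made up for illustration; the inputs are the physical core count (no hyper-threads) and the effective spindle count from the article.

```java
// Hypothetical helper that evaluates the pool sizing formula quoted above.
class PoolSizingSketch {
    // connections = (core_count * 2) + effective_spindle_count
    static int recommendedConnections(int coreCount, int effectiveSpindleCount) {
        return coreCount * 2 + effectiveSpindleCount;
    }

    public static void main(String[] args) {
        // 4 physical cores and 1 hard disk, the example from the article.
        System.out.println(recommendedConnections(4, 1)); // prints 9
    }
}
```

    By this formula, a pool of 300 would only be justified on a machine with well over a hundred cores.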


    Increasing connection timeout

    Another possibility is to increase the connection timeout. As mentioned before, when all connections are in use, a request waits for up to 30 seconds by default; that limit is the connection timeout.

    You can increase this value so that the application waits longer before timing out. If your complex query takes 20 seconds, and you have a connection pool of 300 and 1000 concurrent users, you should theoretically configure your connection timeout to be at least 20 * 1000 / 300 ≈ 67 seconds.

    Be aware though, that means that your application might take a long time before showing a response to the user. If you have a 67 second connection timeout and an additional 20 seconds before your complex query completes, your user might have to wait up to a minute and a half.
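
    The back-of-the-envelope numbers above can be reproduced as follows. This is a rough sketch under the same simplifying assumptions as the text (every user runs one query of the same duration, all at once); the class and method names are hypothetical.

```java
// Hypothetical helper reproducing the timeout estimate from the text.
class TimeoutEstimateSketch {
    // Smallest whole number of seconds >= querySeconds * concurrentUsers / poolSize.
    static long minimumTimeoutSeconds(long querySeconds, long concurrentUsers, long poolSize) {
        return (querySeconds * concurrentUsers + poolSize - 1) / poolSize; // ceiling division
    }

    // Worst case for one user: the full wait for a connection plus the query itself.
    static long worstCaseResponseSeconds(long querySeconds, long concurrentUsers, long poolSize) {
        return minimumTimeoutSeconds(querySeconds, concurrentUsers, poolSize) + querySeconds;
    }

    public static void main(String[] args) {
        System.out.println(minimumTimeoutSeconds(20, 1000, 300));    // prints 67
        System.out.println(worstCaseResponseSeconds(20, 1000, 300)); // prints 87
    }
}
```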


    Improve execution time

    As mentioned before, your primary goal should be to find out why your queries are taking so long. If requests time out with a connection pool of 300, a connection timeout of 30 seconds and 1000 concurrent users, your queries must be taking at least 30 * 300 / 1000 = 9 seconds to complete, which is a lot.
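
    The 9-second figure is the timeout estimate run in reverse. A minimal sketch, with a hypothetical class and method name and the same assumption that all users issue one equally long query at the same time:

```java
// Hypothetical helper: lower bound on query duration implied by observed timeouts.
class QueryDurationSketch {
    // If callers still time out, each query must last at least
    // timeoutSeconds * poolSize / concurrentUsers seconds.
    static double minimumQuerySeconds(double timeoutSeconds, int poolSize, int concurrentUsers) {
        return timeoutSeconds * poolSize / concurrentUsers;
    }

    public static void main(String[] args) {
        System.out.println(minimumQuerySeconds(30.0, 300, 1000)); // prints 9.0
    }
}
```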

    Try to improve the execution time by:

    • Adding proper indexes.
    • Writing your queries properly.
    • Improving the database hardware (disks, cores, network, ...).
    • Limiting the number of records you're dealing with, for example by introducing pagination.
    • Dividing the work. Check whether the query can be split into smaller queries that produce intermediate results, which can then be used in another query, and so on. As long as you're not working in transactions, the connection will be freed up in between, allowing you to serve multiple users at the cost of some performance.
    • Using caching.
    • Precalculating the results. If you're doing some resource-heavy calculation, you could pre-calculate the results at a time when the application isn't used as often, e.g. at night, and store those results in a different table that can be easily queried.
    • ...
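
    To illustrate the caching item from the list above, here is a minimal in-memory sketch. QueryCacheSketch and the loader function are hypothetical stand-ins for your own slow query, and there is deliberately no eviction or invalidation, which a real application would need.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical read-through cache: the slow loader runs once per key.
class QueryCacheSketch<K, V> {
    private final Map<K, V> cache = new ConcurrentHashMap<>();
    private final Function<K, V> slowLoader;            // stands in for the expensive query
    private final AtomicInteger loads = new AtomicInteger();

    QueryCacheSketch(Function<K, V> slowLoader) {
        this.slowLoader = slowLoader;
    }

    // Returns the cached value, invoking the loader only on a cache miss.
    V get(K key) {
        return cache.computeIfAbsent(key, k -> {
            loads.incrementAndGet();
            return slowLoader.apply(k);
        });
    }

    // Number of times the loader (and thus the "database") was actually hit.
    int loadCount() {
        return loads.get();
    }

    public static void main(String[] args) {
        QueryCacheSketch<Integer, String> cache = new QueryCacheSketch<>(id -> "row-" + id);
        cache.get(7);
        cache.get(7); // served from memory, no second load
        System.out.println(cache.loadCount()); // prints 1
    }
}
```

    Every request answered from the cache is one that never needs to borrow a connection from the pool.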
