HikariCP: What database level timeouts should be considered to set maxLifetime for Oracle 11g

后端 未结 1 1962
难免孤独
难免孤独 2020-12-02 02:19

In the documentation for HikariCP, it is mentioned that

We strongly recommend setting this value, and it should be at least 30 seconds less than any d

相关标签:
1条回答
  • 2020-12-02 02:42

    Short answer: none (by default).


    For the record (to include details here in case the link changes), we're talking about property maxLifetime of HikariCP:

    This property controls the maximum lifetime of a connection in the pool. An in-use connection will never be retired, only when it is closed will it then be removed. We strongly recommend setting this value, and it should be at least 30 seconds less than any database or infrastructure imposed connection time limit. A value of 0 indicates no maximum lifetime (infinite lifetime), subject of course to the idleTimeout setting. Default: 1800000 (30 minutes)

    In my experience, it's a good thing that HikariCP does that. As far as I can tell by default Oracle does not enforce a max lifetime for connections (neither on JDBC driver side (1), nor on server side(2)). So in this respect, the "infrastructure-imposed connection time limit" is +infinity -- and that was a problem for us, as we did observe issues with long-lived connections. It also means any value is "at least 30 seconds less", including the default :)

    I suppose the connection layer does not do anything about this because it counts on the pool layer above to handle such things. It was not possible with (now deprecated) implicit connection pool, and I don't know if UCP (the replacement) does that, but if you use HikariCP you don't use those.

    Now, after 30 minutes (usually after many reuses for various purposes) for a given connection, HikariCP closes it and creates a fresh one. That has a very minor cost, and fixed our issues with long-lived connections. We're happy with that default, but still make it configurable just in case (see 2 below).

    (1) OracleDataSource does not offer any configuration point (property or system property) to control that, and I observed infinite lifetime.

    (2) For server-side limits, see profile parameter IDLE_TIME. Quoting this answer:

    Oracle by default will not close a connection due to inactivity. You can configure a profile with an IDLE_TIME to cause Oracle to close inactive connections.

    To verify what is the value of IDLE_TIME for your user, combining answers from this Q&A:

    select p.limit
    from dba_profiles p, dba_users u
    where p.resource_name = 'IDLE_TIME' and p.profile = u.profile and u.username = '...'
    ;
    

    Default value is UNLIMITED.

    Please note there can be other limits enforced elsewhere (firewall... yes I've been bitten by that, although most DB systems have a keep-alive mechanism) that might interfere. So you'd better make it configurable, in case such issues are discovered when you deploy your product.


    On Linux, you can verify max lifetime of physical connections by monitoring TCP sockets connected to your database. I've been running script below on my server (from the DB point of view that's the client host), it takes 1 argument, the ip:port of your oracle node, as it appears in output of netstat -tan (or a pattern if you have several nodes).

    #!/bin/bash
    target="$1"
    dir=$(mktemp -d)
    while sleep 10
    do
        echo "------------ "$(date)
        now=$(date +%s)
        netstat -tan | grep " $target " | awk '{print $4}' | cut -f2 -d: | while read port
        do
            file="p_$port"
            [ ! -e $file ] && touch $file
            ftime=$(stat -c %Z "$file")
            echo -e "$port :\t "$(( now - ftime))
        done
    done
    \rm "$dir"/p_*
    \rmdir "$dir"
    

    If you run that and stop it with ctrl-c during sleep time, it should exit the loop and clean up the temp directory, but this is not 100% foolproof

    In the results none of the ports should show a value that exceeds 1800 seconds (ie 30 minutes), give or take a minute. See example output below, first sample shows 2 sockets above 1800s, they're gone 10s later.

    ------------ Thu Jul 6 16:09:00 CEST 2017
    49806 :  1197
    49701 :  1569
    49772 :  1348
    49782 :  1317
    49897 :  835
    49731 :  1448
    49620 :  1830
    49700 :  1569
    49986 :  523
    49722 :  1498
    49715 :  1509
    49711 :  1539
    49629 :  1820
    49732 :  1448
    50026 :  332
    49849 :  1036
    49858 :  1016
    ------------ Thu Jul 6 16:09:10 CEST 2017
    49806 :  1207
    49701 :  1579
    49772 :  1358
    49782 :  1327
    49897 :  845
    49731 :  1458
    49700 :  1579
    49986 :  533
    49722 :  1508
    49715 :  1519
    49711 :  1549
    49732 :  1458
    50026 :  342
    49849 :  1046
    49858 :  1026
    

    You'll need to run the script for more than 30 minutes to see that, because it does not know age of sockets that existed before

    0 讨论(0)
提交回复
热议问题