I have two servers connecting to a PostgreSQL 9.6 db hosted on Azure. The servers are doing one thing - hitting the Postgres db with a SELECT 1
qu
You need to set a minimum pool size. Doing so ensures that at least this number of connections remains open to the DB regardless of pool usage.
By default (at least for Npgsql), the minimum pool size is 0, so a connection that goes unused for a while will be closed.
In your test, you make one call every 5 seconds, which is not much, and the pool might decide to close the unused connection. According to the docs it should keep it open for 300 seconds though, not just 15.
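As a rough illustration, an Npgsql connection string with a pool floor could look like the one below. The host, database and credentials are placeholders I made up; "Minimum Pool Size" and "Connection Idle Lifetime" are the Npgsql connection string keywords for the two settings discussed above (defaults 0 and 300 seconds). Treat it as a sketch to adapt, not a tuned configuration.

```text
Host=myserver.example.com;Database=mydb;Username=myuser;Password=secret;Minimum Pool Size=1;Connection Idle Lifetime=300
```

With a minimum of 1, the pool should keep one physical connection open even when the application is idle, so the next call does not pay for a fresh connect.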
The first call is almost exactly 5 seconds longer than the rest. This looks like an IP address resolution issue to me. The client first tries a resolution method that is defective for the given server, then after 5 seconds it times out and picks a different method, which works. The result is then cached for a while, so lookups continue to work well until the cached entry expires.
To see if this is the problem, hardcode the IP address for the database host into your "hosts" file, and see if that fixes the problem. If so, then the root cause becomes a question for your network engineers.
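If you want a quick measurement before touching the hosts file, you can time the name lookup on its own, outside of any database driver. Below is a minimal Python sketch; the function name time_resolution is my own, and whether repeated lookups hit a cache depends on your OS resolver, so read the numbers as a hint only.

```python
import socket
import time


def time_resolution(host, port=5432, attempts=3):
    """Return the wall-clock seconds spent in each getaddrinfo() call."""
    timings = []
    for _ in range(attempts):
        start = time.perf_counter()
        try:
            socket.getaddrinfo(host, port, proto=socket.IPPROTO_TCP)
        except socket.gaierror:
            # A failed lookup still shows how long the resolver spent trying.
            pass
        timings.append(time.perf_counter() - start)
    return timings
```

Call it with your database host name and print the list. A first lookup close to 5 seconds followed by near-instant ones would support the resolution theory.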
On the database side, you can turn on slow query logging, either log_min_duration_statement or better yet auto_explain.log_min_duration. But if my theory is correct, this won't show anything. The database doesn't know how long you spent trying to look up its IP address.
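For reference, on a self-managed server the two logging options could be set in postgresql.conf roughly as follows. The 1000 ms threshold is an arbitrary value I picked, and auto_explain has to be loaded before its settings take effect. On a managed Azure instance you would presumably set the equivalent server parameters through the service instead of editing the file, so take this as a sketch.

```conf
# Log any statement that runs for 1000 ms or longer
log_min_duration_statement = 1000

# Load auto_explain and log the plan of any statement over 1000 ms
shared_preload_libraries = 'auto_explain'
auto_explain.log_min_duration = 1000
```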
It is possible that on the first execution the query has to bring a lot of data from disk into memory, and the subsequent executions find everything already in the shared buffers. You can check this by running
EXPLAIN (ANALYZE, BUFFERS) <your query>
The 'read' and 'hit' counts will tell you how many blocks have been read from disk, and how many were found in RAM.
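If you want to compare the first run with later runs without reading plans by eye, a small helper can pull the two counts out of the text output. This is a Python sketch; shared_buffer_counts is a name I made up, and it assumes the default text format, where the top plan node's "Buffers: shared" line already includes the counts of its child nodes.

```python
import re

# Captures the "name=number" pairs that follow "shared" on a Buffers line.
SHARED_RE = re.compile(r"Buffers:.*?\bshared((?:\s+\w+=\d+)+)")


def shared_buffer_counts(plan_text):
    """Return (hit, read) block counts from the first 'Buffers: shared' line."""
    match = SHARED_RE.search(plan_text)
    if match is None:
        return (0, 0)
    counts = dict(pair.split("=") for pair in match.group(1).split())
    return (int(counts.get("hit", 0)), int(counts.get("read", 0)))
```

Run the EXPLAIN once right after the slow first call and again afterwards. A large 'read' count the first time that turns into 'hit' on later runs would point to a cold cache.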