问题
I'm building an R package, the main purpose of which is to abstract away the pain of dealing with a proprietary database that requires some fairly complex SQL queries in order get data out.
As such, the connection to the Microsoft SQL Server (obtained by odbcDriverConnect
) is a constant and important part of this package, but I can't work out how best to manage this and I'm hoping for advice as to how this should be implemented in R.
My current thoughts are:
Make the user ensure they have a valid connection before they call any function. Each function then has
connection
as a parameter which must be passed. This puts a burden on the user.In every function, make a call to
get.connection()
which will get new connection each time. Old connections are then allowed to timeout naturally, which seems a sloppy approach.As above, but return the same connection each time. This appears not to be a viable proposition as I can't prevent connections from timing out through R.
autoReconnect=TRUE
and other tricks I've used in different languages seem to have no effect.
In Java, I would probably have a DatabaseConnectionPool populated with a number of connections and simply grab connections from, and return them to, that pool as needed. I also don't seem to have the timeout issue in Java when I specify autoReconnect=TRUE
.
Any suggestions much appreciated.
回答1:
pool
is an R package for pooling connections such as databases. If you're cool to use a github package, take a look at https://github.com/rstudio/pool. Will reuse or recreate the connection as required.
回答2:
It seems that a mix between the second and the third approach is a reasonable solution i.e. getting the same connection each time, however before returning the connection you can check if it is still opened, otherwise create a new connection.
It is basically as if you are manually implementing autoReconnect=TRUE
来源:https://stackoverflow.com/questions/14779889/how-to-manage-a-database-connection-in-an-r-package