How to handle too many concurrent connections even after using a connection pool?

Question

Scenario

Say you have a website or app that has tons of traffic. And even with a database connection pool, performance is taking a real hit (the site/app may even be crashing) because there are too many concurrent connections.

Question

What options does someone have for dealing with this problem?

My thoughts

I was thinking someone with this problem could create multiple databases (possibly on different machines although I'm not sure that's necessary), each with the same information and updated at the same time, which would grant a multiple of the original number of connections for a single database. But if the database is large that doesn't seem like a very viable solution.

Answer 1:

The question is not specific enough to give a firm suggestion, but the list of what could be done is as follows:

Database cluster: Suitable for situations where you don't want to change your application layer and database is all you touch. There's a limit on how much you can get out of a database cluster. If your request volume keeps on growing, this solution will fail as well eventually. But the good news is that you've got all the functionality you've already had in an ordinary single-instance MySQL.
Sharding: Since your question is tagged with MySQL, and it does not support sharding on its own, if you want to use this solution you need to implement it in your application layer. In this solution you scatter your data logically over multiple databases (preferably in multiple MySQL instances on separate hardware). It will be your responsibility to locate the database that holds a given piece of data. It's one of the most effective solutions ever but it's not always feasible. Its biggest flaw is that data scattered among two or more databases cannot be covered by a single local transaction.
Replication: Depending on your scenario you might be able to incorporate database replication and keep copies of your data on the replicas. This way you can connect to them instead of the master database and reduce the load on it. The default replication setup is master/slave, in which data flows one way, from the master to the slave. So changes you make on a slave are applied on that slave but do not affect the master. There is also a master/master replication configuration in which data flows both ways. Yet you cannot assume atomic integrity for concurrent data changes across both masters. In the end this solution is most effective if you use it in master/slave mode and use the slaves for read-only access.
Caching: Perhaps this solution should not be included here, but since your question does not rule it out, here it goes. One of the ways to reduce database load is to cache its data once extracted. This can be especially beneficial if extracting the data is expensive. There are many cache servers out there, like memcached or redis. This way you can avoid many of the database connections, but only for reads.
Other storage engines: You can always switch to more performant engines if your current one does not provide you with what you need. Of course this is only feasible if your needs allow you to. Nowadays there are NoSQL engines, much more performant than RDBMS, which support sharding natively and you can scale them linearly with minimum effort. There are also Lucene based solutions out there with powerful full-text search capabilities providing you with the same automatic sharding. In fact the only reason why you should be using a traditional RDBMS is the atomic behavior of transactions. But if transactions are not a must, there are much better solutions than RDBMS.
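
To illustrate the Caching item above, here is a minimal cache-aside sketch in Python. It assumes an in-process dictionary as a stand-in for a cache server such as memcached or redis; `TTLCache`, `load_user_from_db` and the key format are hypothetical names made up for this example, not part of any library.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TTLCache:
    """In-process stand-in for a cache server such as memcached or redis."""

    def __init__(self, ttl_seconds: float) -> None:
        self._ttl = ttl_seconds
        self._items: Dict[str, Tuple[float, Any]] = {}

    def get_or_load(self, key: str, loader: Callable[[], Any]) -> Any:
        entry = self._items.get(key)
        now = time.monotonic()
        if entry is not None and entry[0] > now:
            return entry[1]          # cache hit: no database connection used
        value = loader()             # cache miss: one database round trip
        self._items[key] = (now + self._ttl, value)
        return value


db_calls = 0


def load_user_from_db(user_id: int) -> dict:
    """Hypothetical expensive query; here it only counts invocations."""
    global db_calls
    db_calls += 1
    return {"id": user_id, "name": f"user{user_id}"}


cache = TTLCache(ttl_seconds=60)
for _ in range(1000):
    user = cache.get_or_load("user:42", lambda: load_user_from_db(42))
```

In this sketch the thousand lookups reach the (simulated) database only on the first miss. Writes still have to go to the database, and cached entries can be stale until they expire, so the TTL is a trade-off you would have to tune.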

Answer 2:

If you don't already, you could try running your application on an application server -- to get some middleware behind your app. Most application servers will do their own connection pooling (because getting a connection from a web app to a database connection pool is still really really expensive). Additionally, you should be able to configure your application server to use shared connections -- which as the name implies will allow connections to be shared wherever possible.

In short, use an appserver. If you already are, maybe mention which one you're using and we can look at optimizing the server config from there.

Answer 3:

Replication -- Master plus any number of slaves. This gives you "unlimited" read scaling.

Disconnect -- A client should not keep a connection open longer than necessary.

Unix, not Windows -- Need I elaborate?

InnoDB -- Use InnoDB, not MyISAM.

SlowLog -- Set long_query_time to 1 and watch for the top couple of queries; optimize them. See pt-query-digest for help in summarizing the slowlog.
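
For the Replication item above, a rough sketch of how an application might split reads from writes. The server names and the `ReadWriteRouter` class are hypothetical; a real setup would hand back connections from per-server pools rather than strings. Keep in mind that MySQL replication is asynchronous by default, so a read that must see a write you just made is safer on the master.

```python
import itertools
from typing import List


class ReadWriteRouter:
    """Choose a server name per statement; names are hypothetical placeholders."""

    def __init__(self, primary: str, replicas: List[str]) -> None:
        self.primary = primary
        # Round-robin over replicas; fall back to the primary if there are none.
        self._replicas = itertools.cycle(replicas or [primary])

    def route(self, sql: str, in_transaction: bool = False) -> str:
        is_read = sql.lstrip().upper().startswith(("SELECT", "SHOW"))
        if is_read and not in_transaction:
            return next(self._replicas)
        # Writes, and reads inside a transaction, go to the primary.
        return self.primary


router = ReadWriteRouter("db-primary", ["db-replica-1", "db-replica-2"])
targets = [
    router.route("SELECT * FROM posts"),
    router.route("SELECT * FROM users"),
    router.route("UPDATE users SET name = 'a' WHERE id = 1"),
    router.route("SELECT * FROM users", in_transaction=True),
]
```

The prefix check on the SQL text is deliberately naive; it is only there to show where the routing decision is made.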

Answer 4:

This is a typical app scaling problem and many solutions have been devised, Google Bigtable and Amazon's Elastic products for instance. If moving into a cloud and taking advantage of the auto-scaling options they all provide is not an option, then you'll need to create your own setup. Take a look at the docs for Postgres and MySQL, and you'll find that the ideas are pretty similar, including the concepts of

Sharding: spread your client data across several databases and route client requests to the right database instance.
Load balancing: deploy your app on several servers and use middleware to route requests based on the load on each server. It will require some kind of DB synchronization tool like SymmetricDS to keep the databases in sync.

This is by no means a full-blown overview of all your options but might help you get started.
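
As a starting point for the sharding idea, a minimal sketch of application-level routing. It assumes a fixed list of shards and a string client ID; the host names and `shard_for` are invented for the example.

```python
import zlib
from typing import List

# Hypothetical shard hosts; replace with your own database instances.
SHARDS: List[str] = [
    "shard0.example.internal",
    "shard1.example.internal",
    "shard2.example.internal",
]


def shard_for(client_id: str, shards: List[str] = SHARDS) -> str:
    """Map a client ID to one shard, always the same one for the same ID."""
    # crc32 is stable across processes, unlike Python's built-in hash() for str.
    return shards[zlib.crc32(client_id.encode("utf-8")) % len(shards)]


placement = {cid: shard_for(cid) for cid in ("client-1", "client-2", "client-3")}
```

Plain modulo hashing like this moves most keys to a different shard when the number of shards changes, so in practice people often use a lookup table or consistent hashing instead.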

Answer 5:

There are many things you should investigate for this problem.
- Check how many simultaneous connections there are. You can add RAM and raise the maximum number of connections: MySQL's max_connections defaults to 151 and can be set as high as 100000, although every connection consumes memory.

- Make sure your app is closing connections. Even with a pool, the app has to return connections to the pool.

- Run the database on a separate server.

- Make sure you have optimized queries. One long-running query can slow a system down.

- Finally, use MySQL Cluster if other approaches fail. With a high-traffic site you may want to consider this to avoid a single point of failure.
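
On the point about returning connections to the pool, here is a small sketch of what that can look like. It is an assumption-laden toy: `ConnectionPool` is a hypothetical class, and `sqlite3` in-memory connections stand in for a real MySQL driver so the example runs on its own.

```python
import queue
import sqlite3
from contextlib import contextmanager
from typing import Iterator


class ConnectionPool:
    """Fixed-size pool; sqlite3 stands in for a real MySQL driver."""

    def __init__(self, size: int) -> None:
        self._idle: "queue.Queue[sqlite3.Connection]" = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(sqlite3.connect(":memory:", check_same_thread=False))

    @contextmanager
    def connection(self, timeout: float = 1.0) -> Iterator[sqlite3.Connection]:
        conn = self._idle.get(timeout=timeout)  # raises queue.Empty if exhausted
        try:
            yield conn
        finally:
            self._idle.put(conn)  # always give the connection back, even on error

    def idle_count(self) -> int:
        return self._idle.qsize()


pool = ConnectionPool(size=2)

# Many requests can share two connections as long as each one is returned.
for _ in range(100):
    with pool.connection() as conn:
        conn.execute("SELECT 1").fetchone()

# A request that raises still gives its connection back.
try:
    with pool.connection() as conn:
        raise RuntimeError("handler failed")
except RuntimeError:
    pass
```

If a code path takes a connection and never returns it, the pool eventually runs dry and every later request waits until the timeout, which looks a lot like "too many connections" from the outside.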

Answer 6:

In our case, we faced the same issue when the number of concurrent MySQL connections reached 100.

Finally, we found a great npm module, express-myconnection (https://www.npmjs.com/package/express-myconnection). It automatically releases connections when done. It supports Single and Pool connection strategies.

It works fine.

Answer 7:

I was running into similar issues: even though the app was supposedly closing its connections, I could see them stacking up in SQL as sleeping connections. After checking into the issue, I added the following to my connection string in web.config as a test:

Connection Lifetime=600

This should have killed any sleeping connections after 10 mins - but it didn't...
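
One possible explanation, offered as an assumption rather than a diagnosis: the ADO.NET connection pooling documentation describes "Connection Lifetime" as being checked when a connection is returned to the pool, not by a timer that sweeps idle connections. The Python model below only illustrates that behaviour; `LifetimePool` and `PooledConnection` are hypothetical names, and the timestamps are plain numbers rather than a real clock.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PooledConnection:
    created_at: float
    closed: bool = False


class LifetimePool:
    """Toy model of a pool that enforces a maximum lifetime on release only."""

    def __init__(self, max_lifetime: float) -> None:
        self.max_lifetime = max_lifetime
        self.idle: List[PooledConnection] = []

    def release(self, conn: PooledConnection, now: float) -> None:
        # Age is checked only here, at the moment the connection comes back.
        if now - conn.created_at > self.max_lifetime:
            conn.closed = True
        else:
            self.idle.append(conn)


pool = LifetimePool(max_lifetime=600)

young = PooledConnection(created_at=0.0)
pool.release(young, now=30.0)    # returned after 30 s: kept idle in the pool

old = PooledConnection(created_at=0.0)
pool.release(old, now=900.0)     # returned after 900 s: destroyed
```

Under this model, `young` stays in the pool (and shows up as sleeping on the server) however much time passes, because nothing looks at it again until it is borrowed and returned.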

On further review, I found pending Windows updates on both my web server and SQL server. I installed them and, magically, the problem went away!

I wish I could give you a more specific answer, but something between adding that "Connection Lifetime" setting and getting my web and SQL servers up to date with patches completely eliminated the issue for me. I've been clean for 3 weeks now, no issues.

Source: https://stackoverflow.com/questions/31525725/how-to-handle-too-many-concurrent-connections-even-after-using-a-connection-pool

Tags

mysql

database

postgresql

concurrency

database-connection