Question
Production environment is on Azure, using Redis Cache Standard 2.5GB.
Example 1
System.Web.HttpUnhandledException (0x80004005): Exception of type 'System.Web.HttpUnhandledException' was thrown. ---> StackExchange.Redis.RedisTimeoutException: Timeout performing SETNX User.313123, inst: 49, mgr: Inactive, err: never, queue: 0, qu: 0, qs: 0, qc: 0, wr: 0, wq: 0, in: 0, ar: 0, clientName: PRD-VM-WEB-2, serverEndpoint: Unspecified/Construct3.redis.cache.windows.net:6380, keyHashSlot: 15649, IOCP: (Busy=0,Free=1000,Min=1,Max=1000), WORKER: (Busy=1,Free=32766,Min=1,Max=32767) (Please take a look at this article for some common client-side issues that can cause timeouts: http://stackexchange.github.io/StackExchange.Redis/Timeouts)
   at StackExchange.Redis.ConnectionMultiplexer.ExecuteSyncImpl[T](Message message, ResultProcessor`1 processor, ServerEndPoint server) in c:\code\StackExchange.Redis\StackExchange.Redis\StackExchange\Redis\ConnectionMultiplexer.cs:line 2120
   at StackExchange.Redis.RedisBase.ExecuteSync[T](Message message, ResultProcessor`1 processor, ServerEndPoint server) in c:\code\StackExchange.Redis\StackExchange.Redis\StackExchange\Redis\RedisBase.cs:line 81
Example 2
StackExchange.Redis.RedisTimeoutException: Timeout performing GET ForumTopic.33831, inst: 1, mgr: Inactive, err: never, queue: 2, qu: 0, qs: 2, qc: 0, wr: 0, wq: 0, in: 0, ar: 0, clientName: PRD-VM-WEB-2, serverEndpoint: Unspecified/Construct3.redis.cache.windows.net:6380, keyHashSlot: 5851, IOCP: (Busy=0,Free=1000,Min=1,Max=1000), WORKER: (Busy=1,Free=32766,Min=1,Max=32767) (Please take a look at this article for some common client-side issues that can cause timeouts: http://stackexchange.github.io/StackExchange.Redis/Timeouts)
   at StackExchange.Redis.ConnectionMultiplexer.ExecuteSyncImpl[T](Message message, ResultProcessor`1 processor, ServerEndPoint server) in c:\code\StackExchange.Redis\StackExchange.Redis\StackExchange\Redis\ConnectionMultiplexer.cs:line 2120
   at StackExchange.Redis.RedisBase.ExecuteSync[T](Message message, ResultProcessor`1 processor, ServerEndPoint server) in c:\code\StackExchange.Redis\StackExchange.Redis\StackExchange\Redis\RedisBase.cs:line 81
   at StackExchange.Redis.RedisDatabase.StringGet(RedisKey key, CommandFlags flags) in c:\code\StackExchange.Redis\StackExchange.Redis\StackExchange\Redis\RedisDatabase.cs:line 1647
   at C3.Code.Controls.Application.Caching.Distributed.DistributedCacheController.Get[T](String cacheKey) in C:\Construct.net\Source\C3Alpha2\Code\Controls\Application\Caching\Distributed\DistributedCacheController.cs:line 115
   at C3.Code.Controls.Application.Caching.Manager.Manager.Get[T](String key, Func`1 getFromExternFunction, Boolean skipLocalCaches) in C:\Construct.net\Source\C3Alpha2\Code\Controls\Application\Caching\Manager\Manager.cs:line 159
   at C3.PageControls.Forums.TopicRender.Page_Load(Object sender, EventArgs e) in C:\Construct.net\Source\C3Alpha2\PageControls\Forums\TopicRender.ascx.cs:line 40
   at System.Web.UI.Control.OnLoad(EventArgs e)
   at System.Web.UI.Control.LoadRecursive()
   at System.Web.UI.Control.LoadRecursive()
   at System.Web.UI.Control.LoadRecursive()
   at System.Web.UI.Control.LoadRecursive()
   at System.Web.UI.Control.LoadRecursive()
   at System.Web.UI.Control.LoadRecursive()
   at System.Web.UI.Control.LoadRecursive()
   at System.Web.UI.Page.ProcessRequestMain(Boolean includeStagesBeforeAsyncPoint, Boolean includeStagesAfterAsyncPoint)
These errors are sporadic, occurring several times a day.
Is this an Azure network blip, or something I can reduce? The numbers in the error don't look out of the ordinary, and the server load never seems to go above 7% as reported by Azure.
Redis connection
using StackExchange.Redis;

internal static class RedisController
{
    private static readonly object GetConnectionLock = new object();

    // Returns the shared multiplexer, creating it on first use (double-checked locking).
    public static ConnectionMultiplexer GetConnection()
    {
        if (Global.RedisConnection == null)
        {
            lock (GetConnectionLock)
            {
                if (Global.RedisConnection == null)
                {
                    Global.RedisConnection = ConnectionMultiplexer.Connect(
                        Settings.Deployment.RedisConnectionString);
                }
            }
        }
        return Global.RedisConnection;
    }
}
Answer 1:
There are 3 scenarios that can cause timeouts, and it is hard to know which is in play:
- the library is tripping over; in particular, there are known issues relating to the TLS implementation and how we handle the read loop in the v1.* version of the library - something that we have invested a lot of time working on for v2.* (however: it is not always trivial to update to v2, especially if you're using the library as part of other code that depend on a specific version)
- the server/network is tripping over; this is a very real possibility - looking at "slowlog" can help if it is server-side, but I don't have any visibility of that
- the server and network are fine, and the library is doing what it can, but there are some huge blobs flying between client and server that are delaying other operations; this is something that I'm making changes to help identify right now, and if this shows itself to be a common problem, we'll perhaps look at making better use of concurrent connections (which doesn't increase bandwidth, but can reduce latency for blocked operations) - this would be a v2 only change, note
Answer 2:
Lazy Connection
As a best practice make sure you are using the following pattern to connect to the StackExchange Redis client:
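// Lazy<T> guarantees the multiplexer is created once, on first access, in a thread-safe way.
private static Lazy<ConnectionMultiplexer> lazyConnection = new Lazy<ConnectionMultiplexer>(() =>
{
    return ConnectionMultiplexer.Connect("cachename.redis.cache.windows.net,ssl=true,abortConnect=false,password=password");
});

// Shared connection; reuse this instead of calling Connect() per request.
public static ConnectionMultiplexer Connection
{
    get
    {
        return lazyConnection.Value;
    }
}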
If the above does not work, there are some more debugging routes described in Source 1, regarding region, bandwidth and NuGet package versions among others.
IO Threads
Another option could be to increase the minimum IO threads. It's often recommended to set the minimum configuration value for IOCP and WORKER threads to something larger than the default value. There is no one-size-fits-all guidance on what this value should be, because the right value for one application will be too high or too low for another. A good starting place is 200 or 300; then test and tweak as needed.
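Before raising the minimums, it can help to check whether a given timeout actually points at thread pool pressure. The sketch below is only an illustration: it reads the IOCP and WORKER sections of a RedisTimeoutException message and flags the case where Busy exceeds Min, which Microsoft's guidance describes as a sign that the thread pool settings need adjusting. The class and method names (TimeoutDiagnostics, LooksThreadStarved) are hypothetical, and the pattern assumes the message format shown in the question.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of StackExchange.Redis.
public static class TimeoutDiagnostics
{
    // Matches sections such as "IOCP: (Busy=0,Free=1000,Min=1,Max=1000)".
    private static readonly Regex PoolPattern = new Regex(
        @"(?<pool>IOCP|WORKER): \(Busy=(?<busy>\d+),Free=\d+,Min=(?<min>\d+),Max=\d+\)");

    // Returns true when any pool reports more busy threads than its configured minimum.
    public static bool LooksThreadStarved(string exceptionMessage)
    {
        foreach (Match m in PoolPattern.Matches(exceptionMessage))
        {
            int busy = int.Parse(m.Groups["busy"].Value);
            int min = int.Parse(m.Groups["min"].Value);
            if (busy > min)
            {
                return true;
            }
        }
        return false;
    }
}
```

For the two errors in the question (Busy=0 or 1 against Min=1), this check returns false, which suggests the thread pool was probably not the bottleneck in those particular cases.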
How to configure this setting:
- In ASP.NET, use the minIoThreads configuration setting under the <processModel> configuration element in machine.config. According to Microsoft, you can no longer change this value per site by editing your web.config (although this was possible in the past), so the value you choose here applies to all your .NET sites. Note that you don't need to add every property when you set autoConfig to false; putting autoConfig="false" and overriding the value is enough: <processModel autoConfig="false" minIoThreads="250" />
Important Note: the value specified in this configuration element is a per-core setting. For example, if you have a 4 core machine and want your minIOThreads setting to be 200 at runtime, you would use <processModel minIoThreads="50"/>.
- Outside of ASP.NET, use the ThreadPool.SetMinThreads() API.
- In .NET Core, set the environment variable COMPlus_ThreadPool_ForceMinWorkerThreads to override the default MinThreads setting, according to Environment/Registry Configuration Knobs. You can also use the same ThreadPool.SetMinThreads() method as described above.
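As a rough sketch of the ThreadPool.SetMinThreads() route, the helper below raises the minimums at startup without ever lowering values that are already higher. The class and method names (ThreadPoolTuning, EnsureMinThreads, PerCoreValue) are hypothetical, and 200 is only the starting point suggested above, not a recommendation for every application. Unlike the machine.config setting, the values passed to ThreadPool.SetMinThreads() are process-wide totals rather than per-core values.

```csharp
using System;
using System.Threading;

// Hypothetical helper illustrating the two ways of expressing the minimum.
public static class ThreadPoolTuning
{
    // Raises the worker and IOCP minimums to at least the requested totals.
    // Returns false if the runtime rejects the values (for example, above the maximum).
    public static bool EnsureMinThreads(int minWorker, int minIocp)
    {
        ThreadPool.GetMinThreads(out int currentWorker, out int currentIocp);
        int worker = Math.Max(currentWorker, minWorker);
        int iocp = Math.Max(currentIocp, minIocp);
        return ThreadPool.SetMinThreads(worker, iocp);
    }

    // Converts a desired runtime total into the per-core value that
    // machine.config's minIoThreads expects, rounding up.
    public static int PerCoreValue(int desiredTotal, int coreCount)
    {
        return (desiredTotal + coreCount - 1) / coreCount;
    }
}
```

Calling ThreadPoolTuning.EnsureMinThreads(200, 200) once during application startup would be one way to apply it; ThreadPoolTuning.PerCoreValue(200, 4) gives the 50 used in the machine.config example for a 4 core machine.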
Sources:
- Microsoft Azure - Investigating timeout exceptions in StackExchange.Redis for Azure Redis Cache
- StackExchange.Redis
Answer 3:
Switch on network traffic monitoring to confirm or rule out the blip. I have a solution to the issue, but a crude one. Option 1: try restarting the managed Redis instance in Azure.
Answer 4:
My guess is that there is an issue with network stability, hence the timeouts.
Since nobody has mentioned increasing responseTimeout, I would experiment with it. Per the configuration options quoted below, it defaults to the SyncTimeout value, and a short timeout is easily reached on an unstable network, so I would try raising it to see whether that helps with the messages.
Taken from the configuration options:
Configuration string: responseTimeout={int}; ConfigurationOptions property: ResponseTimeout; default: SyncTimeout; meaning: time (ms) to decide whether the socket is unhealthy
There are multiple issues open on this on GitHub. The one combining them all is probably #871, 'The "network stability" / 2.0 / "pipelines" rollup issue'.
One more thing: did you try using ConnectionMultiplexer.ConnectAsync() instead of ConnectionMultiplexer.Connect()?
Source: https://stackoverflow.com/questions/51651796/stackexchange-redis-timeout