WCF + NetTcp: high load makes the channel stop working (calls/second rate)

Posted by £可爱£侵袭症+ on 2020-02-21 05:00:07

Question


First of all, sorry, my English is not fluent.

I'm trying to figure out why my WCF services stop working in an environment with a high calls/second rate. I'm not sure that just increasing the timeout will solve the issue.

We have two web services:

  • The first is hosted on IIS 7.5, Windows Server 2008 R2 Enterprise SP1 x64, with AppFabric (and WAS)
  • The second is hosted in a Windows Service, on Windows 2003 R2 SP1 x86

Both web services have a minimal configuration: no authentication, no transactions, and no special treatment of the message. Here is the binding:

<netTcpBinding>
  <binding transactionFlow="false">
    <security mode="None">
      <message clientCredentialType="None" />
      <transport clientCredentialType="None"></transport>
    </security>
    <reliableSession enabled="false"/>
  </binding>
</netTcpBinding>

We are trying to use the Net.Tcp binding because of its reliability and speed.

FACT 1 - The Net.Tcp binding is the primary cause

When the load is high, the Net.Tcp channel stops working. That's it! But BasicHttp keeps working like a charm.

The Windows Service: the net.tcp channel stays down for some minutes (3 to 10) before it starts working again (BY ITSELF, without us changing anything. Goblins are working hard).

The AppFabric/IIS/WAS service: the net.tcp channel stays down. It needs a manual restart.

The BasicHttpBinding configuration is similar to the net.tcp one: no special treatment of the message, no security settings or anything like that.

FACT 2 - No useful logging

We couldn't find any hint to figure out what's happening. I have tried memory dumps, the event logs, and System.Diagnostics, and found nothing relevant. The most relevant hint is an error from SMSvcHost 4.0.0.0:

An error occurred while dispatching a duplicated socket: this handle is now leaked in the process.
ID: 2272
Source: System.ServiceModel.Activation.TcpWorkerProcess/62875109
Exception: System.TimeoutException: This request operation sent to http://schemas.microsoft.com/2005/12/ServiceModel/Addressing/Anonymous did not receive a reply within the configured timeout (00:01:00). The time allotted to this operation may have been a portion of a longer timeout. This may be because the service is still processing the operation or because the service was unable to send a reply message. Please consider increasing the operation timeout (by casting the channel/proxy to IContextChannel and setting the OperationTimeout property) and ensure that the service is able to connect to the client.

Server stack trace:
   at System.Runtime.AsyncResult.End[TAsyncResult](IAsyncResult result)
   at System.ServiceModel.Channels.ServiceChannel.SendAsyncResult.End(SendAsyncResult result)
   at System.ServiceModel.Channels.ServiceChannel.EndCall(String action, Object[] outs, IAsyncResult result)
   at System.ServiceModel.Channels.ServiceChannelProxy.InvokeEndService(IMethodCallMessage methodCall, ProxyOperationRuntime operation)
   at System.ServiceModel.Channels.ServiceChannelProxy.Invoke(IMessage message)

Exception rethrown at [0]:
   at System.Runtime.AsyncResult.End[TAsyncResult](IAsyncResult result)
   at System.ServiceModel.Activation.WorkerProcess.EndDispatchSession(IAsyncResult result)
Process Name: SMSvcHost
Process ID: 1532

Do you have any tips or configuration tricks to help me solve this issue?

What's the best configuration for high-load scenarios?


Answer 1:


If you generated a service reference in Visual Studio, or with the svcutil tool, make sure you always call the Close or Abort methods of your proxies. I encountered a similar problem some days ago because I forgot to call these methods.




Answer 2:


If you are already calling the Close() and Abort() methods correctly and still receive this error, consider the following scenario:

  1. You run a Microsoft .NET Framework 3.0-based or .NET Framework 3.5-based Windows Communication Foundation (WCF) service.

  2. The WCF service uses the Net.Tcp Port Sharing Service (Smsvchost.exe) and is hosted on a computer that is running Internet Information Services (IIS).

  3. One of the following conditions is true:

    • The CPU usage is high on the computer that is running IIS.
    • A throttle occurs in a service model for the WCF service.
    • Multiple requests are sent to the WCF service at the same time.
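For context, the throttle in the second condition refers to the limits WCF applies through the serviceThrottling service behavior. A minimal sketch of where those limits are set is shown here; the behavior name and all three values are illustrative assumptions, not recommendations, and the behavior still has to be referenced from the service's behaviorConfiguration attribute.

```xml
<!-- Sketch only: "HighLoadBehavior" and the numbers are hypothetical. -->
<system.serviceModel>
  <behaviors>
    <serviceBehaviors>
      <behavior name="HighLoadBehavior">
        <!-- Upper bounds on concurrent calls, sessions and service instances.
             Requests beyond these limits are queued until capacity frees up. -->
        <serviceThrottling maxConcurrentCalls="200"
                           maxConcurrentSessions="400"
                           maxConcurrentInstances="600" />
      </behavior>
    </serviceBehaviors>
  </behaviors>
</system.serviceModel>
```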

In this scenario, the WCF service takes longer than one minute to process a request from a client application. Additionally, an error message that resembles the following event entry is logged in the event log:

Log Name: System

Source: SMSvcHost 3.0.0.0

Date:

Event ID: 8

Task Category: Sharing Service

Level: Error

Keywords: Classic

User: LOCAL SERVICE

Computer:

Description: An error occurred while dispatching a duplicated socket: this handle is now leaked in the process.

ID: 2620

Source: System.ServiceModel.Activation.TcpWorkerProcess

Exception:

System.TimeoutException: This request operation sent to did not receive a reply within the configured timeout (00:01:00). The time allotted to this operation may have been a portion of a longer timeout. This may be because the service is still processing the operation or because the service was unable to send a reply message. Please consider increasing the operation timeout (by casting the channel/proxy to IContextChannel and setting the OperationTimeout property) and ensure that the service is able to connect to the client.

Note: You must restart IIS to recover the WCF service from this issue.

Cause:

This issue occurs because the Smsvchost.exe process times out after one minute when it tries to transfer an incoming connection request to the W3wp.exe worker process. Additionally, this time-out is not configurable.

When the CPU has a heavy workload, or when many concurrent connection requests are incoming, the Smsvchost.exe process cannot transfer the incoming connection to the W3wp.exe worker process within one minute. Therefore, the Smsvchost.exe process times out and eventually stops responding. When this issue occurs, the Smsvchost.exe process cannot route later requests to the W3wp.exe worker process until IIS is restarted.

Solution:

Microsoft suggests applying hotfix 2504602, which is described in the corresponding Microsoft Knowledge Base (KB) article. This hotfix is available for WCF in the .NET Framework 3.0 SP2, the .NET Framework 3.5 SP1, and the .NET Framework 4.

In addition, Microsoft claims to have solved this issue in the .NET Framework 4.5, so you should upgrade to the latest version.

If you upgrade to the .NET Framework 4.5 and the problem persists, the workaround is to modify the smsvchost.exe.config file to increase the timeout, the number of pending accepts, and various other parameters.
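As a rough sketch of that workaround: the net.tcp element under system.serviceModel.activation in SMSvcHost.exe.config exposes listenBacklog, maxPendingConnections, maxPendingAccepts, receiveTimeout and teardownTimeout. The values below are illustrative assumptions, not tested recommendations, and should be tuned under load. Only the net.tcp element is shown; the rest of the shipped file should be left as it is.

```xml
<!-- Fragment of SMSvcHost.exe.config. All values are assumptions for illustration. -->
<configuration>
  <system.serviceModel.activation>
    <!-- listenBacklog:         queue length of the shared listening socket
         maxPendingConnections: connections accepted but not yet dispatched to a worker
         maxPendingAccepts:     outstanding accept operations on the listener
         receiveTimeout:        time allowed to read the initial framing data
         teardownTimeout:       time allowed to shut a connection down -->
    <net.tcp listenBacklog="100"
             maxPendingConnections="1000"
             maxPendingAccepts="16"
             receiveTimeout="00:00:30"
             teardownTimeout="00:00:30" />
  </system.serviceModel.activation>
</configuration>
```

The file normally sits next to SMSvcHost.exe in the .NET Framework installation directory, and the Net.Tcp Port Sharing Service typically has to be restarted before changes take effect. Because the answer describes the one-minute dispatch time-out itself as not configurable, these settings are best understood as reducing the pressure that leads to it, not as changing that time-out.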



Source: https://stackoverflow.com/questions/12377901/wcf-nettcp-high-load-make-the-channel-stop-working-calls-second-rate
