Question
I have a lot of clients sending requests to the server, about 1000 requests per second per client. The server's CPU quickly rises to 600% (8 cores) and stays at that level. When I dump the threads with jstack, I find that SelectorImpl is in the BLOCKED state. The records are as follows:
nioEventLoopGroup-4-1 prio=10 tid=0x00007fef28001800 nid=0x1dbf waiting for monitor entry [0x00007fef9eec7000]
java.lang.Thread.State: BLOCKED (on object monitor)
at sun.nio.ch.EPollSelectorImpl.doSelect(Unknown Source)
- waiting to lock <0x00000000c01f1af8> (a java.lang.Object)
at sun.nio.ch.SelectorImpl.lockAndDoSelect(Unknown Source)
- locked <0x00000000c01d9420> (a io.netty.channel.nio.SelectedSelectionKeySet)
- locked <0x00000000c01f1948> (a java.util.Collections$UnmodifiableSet)
- locked <0x00000000c01d92c0> (a sun.nio.ch.EPollSelectorImpl)
at sun.nio.ch.SelectorImpl.select(Unknown Source)
at io.netty.channel.nio.NioEventLoop.select(NioEventLoop.java:635)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:319)
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:101)
at java.lang.Thread.run(Unknown Source)
Does the high CPU have something to do with this? Another problem is that when I connect a large number of clients, some of them fail to connect.
"nioEventLoopGroup-4-1" prio=10 tid=0x00007fef28001800 nid=0x1dbf waiting for monitor entry [0x00007fef9eec7000]
java.lang.Thread.State: BLOCKED (on object monitor)
at sun.nio.ch.EPollSelectorImpl.doSelect(Unknown Source)
- waiting to lock <0x00000000c01f1af8> (a java.lang.Object)
at sun.nio.ch.SelectorImpl.lockAndDoSelect(Unknown Source)
- locked <0x00000000c01d9420> (a io.netty.channel.nio.SelectedSelectionKeySet)
- locked <0x00000000c01f1948> (a java.util.Collections$UnmodifiableSet)
- locked <0x00000000c01d92c0> (a sun.nio.ch.EPollSelectorImpl)
at sun.nio.ch.SelectorImpl.select(Unknown Source)
at io.netty.channel.nio.NioEventLoop.select(NioEventLoop.java:635)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:319)
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:101)
at java.lang.Thread.run(Unknown Source)
The clients are created by a thread pool, and a connection timeout is configured, but why do connections time out so frequently? Is the server the cause?
public void run() {
    System.out.println(tnum + " connecting...");
    try {
        Bootstrap bootstrap = new Bootstrap();
        bootstrap.group(group)
                 .channel(NioSocketChannel.class)
                 .option(ChannelOption.CONNECT_TIMEOUT_MILLIS, 30000)
                 .handler(loadClientInitializer);
        // Start the connection attempt.
        ChannelFuture future = bootstrap.connect(host, port);
        future.channel().attr(AttrNum).set(tnum);
        future.sync();
        if (future.isSuccess()) {
            System.out.println(tnum + " login success.");
            goSend(tnum, future.channel());
        } else {
            System.out.println(tnum + " login failed.");
        }
    } catch (Exception e) {
        XLog.error(e);
    } finally {
        // group.shutdownGracefully();
    }
}
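For context, here is a minimal sketch of how such clients might be launched from a thread pool; the class name LoadClient, the pool size, and the client count are assumptions for illustration, not taken from the question. The point of the sketch is that all clients share a single NioEventLoopGroup.
// Hypothetical driver: LoadClient is assumed to be the Runnable whose run() is shown above.
import io.netty.channel.nio.NioEventLoopGroup;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class LoadClientLauncher {
    public static void main(String[] args) {
        NioEventLoopGroup group = new NioEventLoopGroup();        // shared by every client
        ExecutorService pool = Executors.newFixedThreadPool(50);  // the thread pool mentioned in the question
        for (int tnum = 0; tnum < 1000; tnum++) {
            pool.submit(new LoadClient(tnum, group));             // assumed constructor (tnum, shared group)
        }
        pool.shutdown();
        // Call group.shutdownGracefully() once all clients are finished.
    }
}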
Answer 1:
Does the high CPU have something to do with this?
It might be. I'd diagnose this problem the following way (on a Linux box):
Find threads which are eating CPU
Using pidstat
I'd find which threads are eating CPU and in what mode (user/kernel) the time is spent.
$ pidstat -p [java-process-pid] -tu 1 | awk '$9 > 50'
This command shows threads eating at least 50% of CPU time. You can inspect what those threads are doing using jstack, VisualVM, or Java Flight Recorder.
If the CPU-hungry threads and the BLOCKED threads are the same threads, the CPU usage probably has something to do with the contention.
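To match a CPU-hungry thread from pidstat against the jstack output, note that pidstat prints thread IDs in decimal while jstack prints the nid in hex; a trivial conversion sketch (the class name is made up):
// Convert a decimal TID (as printed by pidstat) to the hex "nid" that jstack shows.
public class TidToNid {
    public static void main(String[] args) {
        int tid = Integer.parseInt(args[0]);                      // e.g. 7615 from pidstat
        System.out.println("nid=0x" + Integer.toHexString(tid));  // prints nid=0x1dbf
    }
}
For example, the trace in the question shows nid=0x1dbf, which corresponds to decimal TID 7615 in pidstat output.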
Find the reason for the connection timeouts
Basically, you get a connection timeout if the two OSes can't finish the TCP handshake in the given time. Several possible reasons for this:
- Network link saturation. Can be diagnosed using sar -n DEV 1 and comparing the rxkB/s and txkB/s columns to your link's maximum throughput.
- The server (Netty) doesn't respond with an accept() call within the given timeout. That thread can be BLOCKED or starving for CPU time. You can find which threads are calling accept() (and therefore finishing the TCP handshake) using strace -f -e trace=accept -p [java-pid], and after that check for possible reasons using pidstat/jstack.
Also, you can find the number of connection-open requests that have been received but not yet confirmed with netstat -an | grep -c SYN_RECV.
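On the Netty server side, a related knob is the accept backlog: if the server can't accept() connections fast enough, the listen backlog fills up and new connection attempts time out. Below is a sketch of a server bootstrap with a dedicated boss group and a larger backlog; the class name, port, ServerInitializer handler, and backlog value are assumptions for illustration, not the asker's code.
// Sketch only: ServerInitializer, the port, and the backlog value are assumptions.
import io.netty.bootstrap.ServerBootstrap;
import io.netty.channel.ChannelOption;
import io.netty.channel.EventLoopGroup;
import io.netty.channel.nio.NioEventLoopGroup;
import io.netty.channel.socket.nio.NioServerSocketChannel;

public class TunedServer {
    public static void main(String[] args) throws InterruptedException {
        EventLoopGroup bossGroup = new NioEventLoopGroup(1);   // only accepts connections
        EventLoopGroup workerGroup = new NioEventLoopGroup();  // handles the accepted channels
        try {
            ServerBootstrap b = new ServerBootstrap();
            b.group(bossGroup, workerGroup)
             .channel(NioServerSocketChannel.class)
             .option(ChannelOption.SO_BACKLOG, 1024)           // queue for connections waiting to be accepted
             .childHandler(new ServerInitializer());           // assumed ChannelInitializer
            b.bind(8080).sync().channel().closeFuture().sync();
        } finally {
            bossGroup.shutdownGracefully();
            workerGroup.shutdownGracefully();
        }
    }
}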
Answer 2:
If you can elaborate more on what your Netty code is doing, it could be helpful. Regardless, please make sure you are closing the channels. Note the following from the Channel javadoc:
It is important to call close() or close(ChannelPromise) to release all resources once you are done with the Channel. This ensures all resources are released in a proper way, i.e. filehandles.
If you are closing the channels, then the problem may be with the logic itself, e.g. running into infinite loops or similar, which could explain the high CPU.
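As a small illustration of the close() advice, here is a sketch based on the question's client code (goSend and tnum are the asker's names; the try/finally placement is an assumption about where the work on the channel ends):
// Make sure the channel is closed once the work on it is done,
// so file handles and other resources are released.
Channel channel = future.channel();
try {
    goSend(tnum, channel);
} finally {
    channel.close();
}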
Source: https://stackoverflow.com/questions/18120634/selectorimpl-is-blocked