Details
Type: Bug
Status: Resolved
Priority: Critical
Resolution: Fixed
Affects Version/s: 1.9.91
Fix Version/s: 1.9.92
Component/s: Web UI - General
Security Level: Default (Default Security Scheme)
Environment:
CentOS 5.6 64-bit
OpenNMS 1.9.91
Oracle JDK 6u26
20 managed nodes
Description
In my 1.9.91 install with an SSL-enabled Jetty, I keep logging on to the box to find load averages in the 20-50 range and the server brought to its knees.
I managed to catch thread dumps of this happening twice: once on 9/16 (load average ~23, all from the java process) and once on 9/19 (load average ~53, all from the java process). Both thread dumps are attached. I'm no expert at reading thread dumps, but I believe this is a representative hung thread:
"qtp52888755-307799" prio=10 tid=0x00002aaabb96b000 nid=0xe18 runnable [0x000000004f4db000]
   java.lang.Thread.State: RUNNABLE
    at java.net.SocketInputStream.socketRead0(Native Method)
    at java.net.SocketInputStream.read(SocketInputStream.java:129)
    at com.sun.net.ssl.internal.ssl.InputRecord.readFully(InputRecord.java:293)
    at com.sun.net.ssl.internal.ssl.InputRecord.read(InputRecord.java:331)
    at com.sun.net.ssl.internal.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:798)
    - locked <0x000000050cbd8580> (a java.lang.Object)
    at com.sun.net.ssl.internal.ssl.SSLSocketImpl.readDataRecord(SSLSocketImpl.java:755)
    at com.sun.net.ssl.internal.ssl.AppInputStream.read(AppInputStream.java:75)
    - locked <0x000000050cbd8620> (a com.sun.net.ssl.internal.ssl.AppInputStream)
    at org.eclipse.jetty.io.ByteArrayBuffer.readFrom(ByteArrayBuffer.java:388)
    at org.eclipse.jetty.io.bio.StreamEndPoint.fill(StreamEndPoint.java:132)
    at org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.fill(SocketConnector.java:209)
    at org.eclipse.jetty.server.ssl.SslSocketConnector$SslConnectorEndPoint.fill(SslSocketConnector.java:612)
    at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:289)
    at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:214)
    at org.eclipse.jetty.server.HttpConnection.handle(HttpConnection.java:411)
    at org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:241)
    at org.eclipse.jetty.server.ssl.SslSocketConnector$SslConnectorEndPoint.run(SslSocketConnector.java:664)
    at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:529)
    at java.lang.Thread.run(Thread.java:662)
I haven't yet narrowed down what triggers the hung threads, but the problem seems to occur whenever we use the web UI a lot (for example, when we're chasing a complex performance problem on a WAN link and are looking at lots of graphs on various nodes). If we just leave the web UI alone and nobody touches it, the problem doesn't seem to pop up much.
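For anyone comparing the two attached dumps, a quick way to see how many Jetty pool threads share the signature above is to scan the dump text for threads whose top frame is `socketRead0`. The following is a minimal sketch, not part of OpenNMS or Jetty: the class name `ThreadDumpScan`, the method `threadsWithTopFrame`, and the `qtp` name prefix (taken from the sample thread name) are illustrative assumptions, and it expects a dump in the usual jstack text layout. Note that a thread sitting in `socketRead0` is reported as RUNNABLE even while it is blocked waiting for data, so the matches are candidates to look at rather than proof of which threads are burning CPU.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for this ticket; not an OpenNMS or Jetty class.
public class ThreadDumpScan {

    /**
     * Returns the names of threads whose name starts with namePrefix and
     * whose top (first) stack frame contains frameMarker.
     */
    public static List<String> threadsWithTopFrame(String dump, String namePrefix, String frameMarker) {
        List<String> matches = new ArrayList<String>();
        String current = null;   // name of the thread block being read
        boolean topSeen = false; // true once the first "at" frame of the block was checked
        for (String raw : dump.split("\\r?\\n")) {
            String line = raw.trim();
            if (line.startsWith("\"")) {
                // Thread header line: the name is the text between the first pair of quotes.
                int end = line.indexOf('"', 1);
                current = end > 0 ? line.substring(1, end) : null;
                topSeen = false;
            } else if (current != null && !topSeen && line.startsWith("at ")) {
                // Only the first frame of each thread is compared against the marker.
                topSeen = true;
                if (current.startsWith(namePrefix) && line.contains(frameMarker)) {
                    matches.add(current);
                }
            }
        }
        return matches;
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: java ThreadDumpScan <thread-dump-file>");
            return;
        }
        // Files/Paths need Java 7 or later on the machine doing the analysis.
        String dump = new String(Files.readAllBytes(Paths.get(args[0])), StandardCharsets.UTF_8);
        List<String> hits = threadsWithTopFrame(dump, "qtp", "socketRead0");
        System.out.println(hits.size() + " pool thread(s) with top frame socketRead0");
        for (String name : hits) {
            System.out.println("  " + name);
        }
    }
}
```

Running it against each attached dump and comparing the counts (and whether the same thread names appear in both) would show whether threads are piling up in this read or are just ordinary idle keep-alive connections.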