Details
Type: Bug
Status: Resolved
Priority: Critical
Resolution: Fixed
Version: 24.0.0
Security Level: Default (Default Security Scheme)
Sprint: Horizon 2019 - May 15th 2019
Description
After OpenNMS has been running for a while (around 24 hours in the reporting environment), the web UI stops accepting new TCP connections. The rest of the system is unaffected. The following exception stack trace appears in jetty-server.log:
2019-05-07 06:01:06,759 WARN [qtp92175805-91815] o.e.j.u.t.s.EatWhatYouKill:
java.nio.channels.CancelledKeyException: null
    at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:71) ~[?:?]
    at sun.nio.ch.SelectionKeyImpl.readyOps(SelectionKeyImpl.java:130) ~[?:?]
    at org.eclipse.jetty.io.ManagedSelector$SelectorProducer.processSelected(ManagedSelector.java:477) ~[jetty-io-9.4.14.v20181114.jar:9.4.14.v20181114]
    at org.eclipse.jetty.io.ManagedSelector$SelectorProducer.produce(ManagedSelector.java:352) ~[jetty-io-9.4.14.v20181114.jar:9.4.14.v20181114]
    at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.produceTask(EatWhatYouKill.java:357) [jetty-util-9.4.14.v20181114.jar:9.4.14.v20181114]
    at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:181) [jetty-util-9.4.14.v20181114.jar:9.4.14.v20181114]
    at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:168) [jetty-util-9.4.14.v20181114.jar:9.4.14.v20181114]
    at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:126) [jetty-util-9.4.14.v20181114.jar:9.4.14.v20181114]
    at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:366) [jetty-util-9.4.14.v20181114.jar:9.4.14.v20181114]
    at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:765) [jetty-util-9.4.14.v20181114.jar:9.4.14.v20181114]
    at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:683) [jetty-util-9.4.14.v20181114.jar:9.4.14.v20181114]
    at java.lang.Thread.run(Thread.java:834) [?:?]
I've tracked the likely cause to a Jetty bug that was fixed after the release of Jetty 9.4.14. Updating the jettyVersion property in our top-level POM to 9.4.18 should take care of the problem. Note that we also ship Jetty 9.4.12 in the system directory; I have not tracked down whether that is the result of a direct or an indirect dependency, but the Jetty in the OPENNMS_HOME/lib directory appears to be the one that needs upgrading, at a minimum.
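To see which Jetty jars an installation actually ships, something like the following rough sketch could help. It walks a directory tree and lists every jetty-*.jar whose file name carries a version older than 9.4.18. The class name JettyJarAudit, its helper methods, and the /opt/opennms default path are hypothetical and only for illustration; the file-name pattern is assumed from the jar names visible in the stack trace (for example jetty-io-9.4.14.v20181114.jar) and may not cover every Jetty artifact.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Stream;

// Hypothetical helper, not part of OpenNMS or Jetty.
public class JettyJarAudit {
    // Assumed naming scheme: jetty-<module>-<major>.<minor>.<patch>[.<qualifier>].jar
    private static final Pattern JETTY_JAR =
            Pattern.compile("^jetty-.*?-(\\d+)\\.(\\d+)\\.(\\d+)(?:\\..*)?\\.jar$");

    /** Returns {major, minor, patch} parsed from a Jetty jar file name, or null if the name does not match. */
    public static int[] parseVersion(String fileName) {
        Matcher m = JETTY_JAR.matcher(fileName);
        if (!m.matches()) {
            return null;
        }
        return new int[] {
            Integer.parseInt(m.group(1)),
            Integer.parseInt(m.group(2)),
            Integer.parseInt(m.group(3))
        };
    }

    /** True if version v is strictly older than version min (both are {major, minor, patch}). */
    public static boolean isOlderThan(int[] v, int[] min) {
        for (int i = 0; i < 3; i++) {
            if (v[i] != min[i]) {
                return v[i] < min[i];
            }
        }
        return false; // equal versions are not "older"
    }

    /** Recursively collects Jetty jars under root whose version is older than min. */
    public static List<Path> findOldJettyJars(Path root, int[] min) throws IOException {
        List<Path> old = new ArrayList<>();
        try (Stream<Path> paths = Files.walk(root)) {
            paths.filter(Files::isRegularFile).forEach(p -> {
                int[] v = parseVersion(p.getFileName().toString());
                if (v != null && isOlderThan(v, min)) {
                    old.add(p);
                }
            });
        }
        return old;
    }

    public static void main(String[] args) throws IOException {
        // The default install path is an assumption; pass OPENNMS_HOME as the first argument instead.
        Path home = Paths.get(args.length > 0 ? args[0] : "/opt/opennms");
        for (Path jar : findOldJettyJars(home, new int[] {9, 4, 18})) {
            System.out.println(jar);
        }
    }
}
```

Running it with OPENNMS_HOME as the argument should print the outdated Jetty jars in both the lib and system directories, which may help narrow down where the 9.4.12 copy comes from. It does not say whether a jar is pulled in directly or transitively.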