Jetty HTTPS selectors can become unresponsive following CancelledKeyException

Description

After OpenNMS has been running for a while (around 24 hours in the reporting environment), the web UI stops accepting new TCP connections. The rest of the system is unaffected. The following exception stack trace appears in jetty-server.log:

I've tracked the likely cause to a Jetty bug that was fixed after the release of Jetty 9.4.14. Updating the jettyVersion property in our top-level POM to 9.4.18 should take care of the problem. Note that we also ship Jetty 9.4.12 in the system directory; I have not tracked down whether that is the result of a direct or an indirect dependency, but the Jetty in the OPENNMS_HOME/lib directory appears to be the one that needs upgrading at minimum.
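
As a rough illustration of the change proposed above, the property bump in the top-level pom.xml might look like the fragment below. The property name jettyVersion and the target release 9.4.18 come from this ticket; the date qualifier on the version string is an assumption based on Jetty's usual release naming and should be verified against the published artifacts.

```xml
<!-- Top-level pom.xml (sketch only; surrounding elements omitted) -->
<properties>
  <!-- Target release named in this ticket. The ".v20190429" qualifier is an
       assumption; check the exact version string before committing. -->
  <jettyVersion>9.4.18.v20190429</jettyVersion>
</properties>
```

Running something like `mvn dependency:tree -Dincludes=org.eclipse.jetty` afterwards may help show whether the 9.4.12 copy in the system directory arrives through a direct or a transitive dependency.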

Environment

See https://mynms.opennms.com/Ticket/Display.html?id=6074

Acceptance / Success Criteria

None

Activity

Benjamin Reed May 20, 2019 at 1:49 PM

Since foundation-2018 is also on the same Jetty version, I'm gonna try updating it there first.

Benjamin Reed May 17, 2019 at 3:44 PM

PR here: https://github.com/OpenNMS/opennms/pull/2510

I'm honestly not sure if we end up using the Karaf Jetty definitions in The Real World since everything goes through our proxy, but in case we do, I changed the Karaf feature XML to use ${jettyVersion} instead.
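
A minimal sketch of what referencing the property from a Karaf feature descriptor could look like, assuming the descriptor is passed through Maven resource filtering so that ${jettyVersion} is substituted at build time. The feature name below is hypothetical and the single bundle entry is only an example; this is not the actual content of the PR.

```xml
<!-- Karaf features.xml fragment (sketch; "example-jetty" is a hypothetical feature name) -->
<feature name="example-jetty" version="${jettyVersion}">
  <!-- Version comes from the top-level POM property instead of a hard-coded value -->
  <bundle>mvn:org.eclipse.jetty/jetty-server/${jettyVersion}</bundle>
</feature>
```

With this arrangement, a future Jetty upgrade would only need the one property changed, rather than every version string in the descriptor.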

Fixed

Created May 16, 2019 at 6:19 PM
Updated May 21, 2019 at 6:53 PM
Resolved May 21, 2019 at 6:53 PM