Ops wallboard hanging due to Vaadin session deadlock

Description

This issue was discovered while investigating hanging system tests.

The container logs contained the following error:

Exception in thread "Timer-8" java.lang.NullPointerException
    at org.opennms.features.vaadin.dashboard.ui.wallboard.WallboardBody$1.run(WallboardBody.java:99)
    at java.base/java.util.TimerThread.mainLoop(Timer.java:556)
    at java.base/java.util.TimerThread.run(Timer.java:506)
2019-08-16 17:22:50

Requests to the server were stuck waiting on the Vaadin session lock, with the following stack trace:

"qtp192816158-669" #669 prio=5 os_prio=0 cpu=473.16ms elapsed=338.74s tid=0x00007fe11c1c1000 nid=0x3cd waiting on condition [0x00007fe05e6e1000]
   java.lang.Thread.State: WAITING (parking)
    at jdk.internal.misc.Unsafe.park(java.base@11.0.3/Native Method)
    - parking to wait for <0x00000000de6fc5d8> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
    at java.util.concurrent.locks.LockSupport.park(java.base@11.0.3/LockSupport.java:194)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(java.base@11.0.3/AbstractQueuedSynchronizer.java:885)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(java.base@11.0.3/AbstractQueuedSynchronizer.java:917)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(java.base@11.0.3/AbstractQueuedSynchronizer.java:1240)
    at java.util.concurrent.locks.ReentrantLock.lock(java.base@11.0.3/ReentrantLock.java:267)
    at com.vaadin.server.VaadinService.lockSession(VaadinService.java:702)
    at com.vaadin.server.VaadinService.findOrCreateVaadinSession(VaadinService.java:738)
    at com.vaadin.server.VaadinService.findVaadinSession(VaadinService.java:602)
    at com.vaadin.server.VaadinService.handleRequest(VaadinService.java:1595)
    at com.vaadin.server.VaadinServlet.service(VaadinServlet.java:445)
    at org.opennms.vaadin.extender.internal.extender.ApplicationFactoryServiceTracker$FactoryServlet.service(ApplicationFactoryServiceTracker.java:134)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
    at org.apache.felix.http.base.internal.handler.ServletHandler.handle(ServletHandler.java:120)
    at
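
The ticket does not spell out how the two traces are related, and the linked commit remains the authoritative fix. One plausible mechanism, offered only as an assumption, is that a background thread took the session lock and then died from an exception before releasing it, so every later request parks in ReentrantLock.lock(). The minimal sketch below reproduces that pattern with plain JDK classes. The names SessionLockLeakDemo, lockIsFreeAfterFailure, failingUpdate and sessionLock are hypothetical and are not taken from the OpenNMS or Vaadin code.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

public class SessionLockLeakDemo {

    // Hypothetical stand-in for a UI update that fails with an NPE.
    static void failingUpdate() {
        throw new NullPointerException("simulated failure in background task");
    }

    /**
     * Starts a background thread that takes the lock and then fails.
     * Returns true if the calling thread can acquire the lock afterwards.
     */
    public static boolean lockIsFreeAfterFailure(boolean unlockInFinally) {
        // Stand-in for the session lock; an assumption, not the real Vaadin object.
        ReentrantLock sessionLock = new ReentrantLock();

        Thread background = new Thread(() -> {
            sessionLock.lock();
            if (unlockInFinally) {
                try {
                    failingUpdate();
                } finally {
                    sessionLock.unlock(); // released even though the update threw
                }
            } else {
                failingUpdate();
                sessionLock.unlock(); // never reached: the thread dies holding the lock
            }
        }, "Timer-demo");
        // Swallow the uncaught exception so the demo output stays readable.
        background.setUncaughtExceptionHandler((thread, error) -> { });
        background.start();

        try {
            background.join();
            // A request thread would call lock() and park forever; use a timeout here.
            return sessionLock.tryLock(200, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("without finally: lock free = " + lockIsFreeAfterFailure(false));
        System.out.println("with finally:    lock free = " + lockIsFreeAfterFailure(true));
    }
}
```

In this sketch the first call reports false, because a ReentrantLock stays owned by a thread that has terminated, and the second reports true. Releasing the lock in a finally block is the general remedy for this pattern. Whether the referenced commit takes that approach is not stated here.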

Acceptance / Success Criteria

None

Attachments

1
  • 16 Aug 2019, 11:36 PM

Activity

Christian Pape August 29, 2019 at 11:03 AM

Jesse White August 16, 2019 at 11:44 PM

Fixed in release-25.0.0 with https://github.com/OpenNMS/opennms/commit/cfd035a610fc07b7b791a8dadb0f888623c5ffbe

Keeping this open so we can evaluate whether or not this needs to be backported.

Fixed

Created August 16, 2019 at 11:34 PM
Updated September 6, 2019 at 10:45 AM
Resolved September 6, 2019 at 10:45 AM
