Unknown NullPointerException on Pollerd related to Category Membership events

Description

I have a small lab that I prepared to reproduce https://opennms.atlassian.net/browse/NMS-7025, and I discovered the following exceptions:

2014-10-17 11:26:35,532 WARN [Poller:PollerEventProcessor-Thread] o.o.n.e.EventIpcManagerDefaultImpl: run: an unexpected error occured during ListenerThread Poller:PollerEventProcessor
java.lang.NullPointerException
    at org.opennms.netmgt.poller.PollerEventProcessor.serviceReschedule(PollerEventProcessor.java:599) ~[opennms-services-14.0.0-SNAPSHOT.jar:?]
    at org.opennms.netmgt.poller.PollerEventProcessor.onEvent(PollerEventProcessor.java:585) ~[opennms-services-14.0.0-SNAPSHOT.jar:?]
    at org.opennms.netmgt.eventd.EventIpcManagerDefaultImpl$EventListenerExecutor$2.run(EventIpcManagerDefaultImpl.java:178) [opennms-services-14.0.0-SNAPSHOT.jar:?]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [?:1.7.0_67]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [?:1.7.0_67]
    at org.opennms.core.concurrent.LogPreservingThreadFactory$2.run(LogPreservingThreadFactory.java:106) [opennms-util-14.0.0-SNAPSHOT.jar:?]
    at java.lang.Thread.run(Thread.java:745) [?:1.7.0_67]

2014-10-17 11:26:42,190 ERROR [Poller:PollerEventProcessor-Thread] o.o.n.p.p.PollableElement: Unexpected exception: null
java.lang.NullPointerException
    at org.opennms.netmgt.poller.pollables.PollableService$1.run(PollableService.java:430) ~[opennms-services-14.0.0-SNAPSHOT.jar:?]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) ~[?:1.7.0_67]
    at org.opennms.netmgt.poller.pollables.PollableElement.withTreeLock(PollableElement.java:264) [opennms-services-14.0.0-SNAPSHOT.jar:?]
    at org.opennms.netmgt.poller.pollables.PollableElement.withTreeLock(PollableElement.java:250) [opennms-services-14.0.0-SNAPSHOT.jar:?]
    at org.opennms.netmgt.poller.pollables.PollableElement.withTreeLock(PollableElement.java:228) [opennms-services-14.0.0-SNAPSHOT.jar:?]
    at org.opennms.netmgt.poller.pollables.PollableService.delete(PollableService.java:433) [opennms-services-14.0.0-SNAPSHOT.jar:?]
    at org.opennms.netmgt.poller.PollerEventProcessor.serviceReschedule(PollerEventProcessor.java:629) [opennms-services-14.0.0-SNAPSHOT.jar:?]
    at org.opennms.netmgt.poller.PollerEventProcessor.onEvent(PollerEventProcessor.java:585) [opennms-services-14.0.0-SNAPSHOT.jar:?]
    at org.opennms.netmgt.eventd.EventIpcManagerDefaultImpl$EventListenerExecutor$2.run(EventIpcManagerDefaultImpl.java:178) [opennms-services-14.0.0-SNAPSHOT.jar:?]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [?:1.7.0_67]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [?:1.7.0_67]
    at org.opennms.core.concurrent.LogPreservingThreadFactory$2.run(LogPreservingThreadFactory.java:106) [opennms-util-14.0.0-SNAPSHOT.jar:?]
    at java.lang.Thread.run(Thread.java:745) [?:1.7.0_67]

It is not clear whether the two messages share the same root cause or are unrelated issues, and neither of them tells me where the problem is.

I'm attaching the full poller.log for reference. If the configuration and/or a dump of the DB is required, please let me know.

Acceptance / Success Criteria

None

Attachments

2
  • 21 Oct 2014, 05:50 PM
  • 17 Oct 2014, 03:57 PM

Activity

Benjamin Reed October 27, 2014 at 4:42 PM

I also changed the nodeGainedServiceHandler to check if the service is already being polled, since there's a possibility we're getting events out-of-order and that's why we get the "changed" event before the service is being polled.
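
The "already being polled" check described above can be sketched roughly as follows. This is only an illustration of the idea, not the actual Pollerd change: the class `GainedServiceSketch`, the method `onNodeGainedService`, and the string key format are all hypothetical.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for the poller's bookkeeping; not OpenNMS code.
class GainedServiceSketch {
    // Keys of the services that are currently scheduled for polling.
    private final Set<String> polledServices = new HashSet<>();

    /**
     * Handles a "node gained service" event.
     * Returns true if the service was newly scheduled, false if it was
     * already being polled (for example after out-of-order events).
     */
    boolean onNodeGainedService(int nodeId, String ipAddr, String serviceName) {
        String key = nodeId + "/" + ipAddr + "/" + serviceName;
        // Set.add returns false when the key is already present,
        // so a duplicate event does not schedule the service twice.
        return polledServices.add(key);
    }
}
```

With a check like this, handling the same event twice, or handling it after another event has already scheduled the service, becomes a no-op instead of creating a duplicate.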

Benjamin Reed October 27, 2014 at 4:35 PM

Yup, the code was assuming that if it got a "category changed" event, the node would already be polled, but this is not always the case.

I fixed the code to handle this case properly by null-checking the "polled node" object we pull out of the Poller and skipping any of the "db-vs-poller" delta-handling when there's no existing polled node.
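
A minimal sketch of the kind of guard described above, assuming a simple in-memory map of polled nodes. The names `CategoryChangeSketch`, `startPolling`, and `onCategoryChanged` are hypothetical and are not taken from the OpenNMS source; the real code works on Pollerd's own pollable objects.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for the poller's in-memory view; not OpenNMS code.
class CategoryChangeSketch {
    // nodeId -> names of the services currently polled on that node
    private final Map<Integer, Set<String>> polledNodes = new HashMap<>();

    void startPolling(int nodeId, String serviceName) {
        Set<String> services = polledNodes.get(nodeId);
        if (services == null) {
            services = new HashSet<>();
            polledNodes.put(nodeId, services);
        }
        services.add(serviceName);
    }

    /**
     * Handles a "category changed" event for a node.
     * Returns the services that were polled but are no longer in the
     * database list, and stops tracking them.
     */
    Set<String> onCategoryChanged(int nodeId, Set<String> servicesInDb) {
        Set<String> removed = new HashSet<>();
        Set<String> polled = polledNodes.get(nodeId);
        if (polled == null) {
            // The node is not being polled yet, so there is nothing to
            // compare against: skip the db-vs-poller delta handling.
            return removed;
        }
        for (String service : polled) {
            if (!servicesInDb.contains(service)) {
                removed.add(service);
            }
        }
        polled.removeAll(removed);
        return removed;
    }
}
```

Without the null check, the loop would dereference a missing entry whenever the event arrives for a node that has not been scheduled yet, which is the failure mode this ticket describes.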

Alejandro Galue October 24, 2014 at 1:44 PM

The problem seems to happen when adding a new node to a requisition with a category associated with it.

Alejandro Galue October 24, 2014 at 12:37 PM

I've updated the title of the issue to reflect the problem.

The VMware nodes have categories associated with them. The NPE comes from the Pollerd code that handles the nodeCategoryMembershipChanged event.

Alejandro Galue October 21, 2014 at 5:50 PM

poller-20141021.log.gz contains the logs from today's test, showing the NPE.

Fixed

Details

Created October 17, 2014 at 3:57 PM
Updated May 11, 2015 at 2:49 PM
Resolved October 27, 2014 at 4:35 PM