Details
Type: Bug
Status: Resolved
Priority: Blocker
Resolution: Fixed
15.0.1, Meridian-2015.1.0, 16.0.0
Security Level: Default (Default Security Scheme)
Environment: Linux Redhat, Oracle Java version 1.7.0_72
Finalize 16.0.1
Description
XML data collection resolves the wrong RRD storage directory.
As a side effect, data is stored in the wrong directory, which leads to concurrent write access to the same RRD files.
The exception seen in the log is:
org.opennms.netmgt.collection.api.CollectionException: An undeclared throwable was caught during data collection for interface 27/10.200.19.12/CiscoPorts XML-Collector
    at org.opennms.netmgt.collectd.CollectableService.doCollection(CollectableService.java:421) ~[opennms-services-15.0.1.jar:?]
    at org.opennms.netmgt.collectd.CollectableService.doRun(CollectableService.java:322) [opennms-services-15.0.1.jar:?]
    at org.opennms.netmgt.collectd.CollectableService.access$000(CollectableService.java:70) [opennms-services-15.0.1.jar:?]
    at org.opennms.netmgt.collectd.CollectableService$1.run(CollectableService.java:300) [opennms-services-15.0.1.jar:?]
    at org.opennms.core.logging.Logging.withPrefix(Logging.java:66) [org.opennms.core.logging-15.0.1.jar:?]
    at org.opennms.netmgt.collectd.CollectableService.run(CollectableService.java:296) [opennms-services-15.0.1.jar:?]
    at org.opennms.netmgt.scheduler.LegacyScheduler$1.run(LegacyScheduler.java:209) [opennms-services-15.0.1.jar:?]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [?:1.7.0_72]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [?:1.7.0_72]
    at org.opennms.core.concurrent.LogPreservingThreadFactory$3.run(LogPreservingThreadFactory.java:124) [opennms-util-15.0.1.jar:?]
    at java.lang.Thread.run(Thread.java:745) [?:1.7.0_72]
Caused by: java.util.ConcurrentModificationException
    at java.util.HashMap$HashIterator.nextEntry(HashMap.java:922) ~[?:1.7.0_72]
    at java.util.HashMap$KeyIterator.next(HashMap.java:956) ~[?:1.7.0_72]
    at org.opennms.netmgt.collection.api.AttributeGroup.visit(AttributeGroup.java:107) ~[org.opennms.features.collection.api-15.0.1.jar:?]
    at org.opennms.netmgt.collection.support.AbstractCollectionResource.visit(AbstractCollectionResource.java:117) ~[org.opennms.features.collection.api-15.0.1.jar:?]
    at org.opennms.netmgt.collection.support.MultiResourceCollectionSet.visit(MultiResourceCollectionSet.java:69) ~[org.opennms.features.collection.api-15.0.1.jar:?]
    at org.opennms.netmgt.collectd.CollectableService.doCollection(CollectableService.java:394) ~[opennms-services-15.0.1.jar:?]
    ... 10 more
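For readers unfamiliar with the root cause in the trace: the `Caused by` frames show a `HashMap` key iterator detecting that the map was structurally modified while `AttributeGroup.visit` was iterating over it. The sketch below is not OpenNMS code; the class name `CmeSketch` and the map contents are hypothetical. It reproduces the same fail-fast behaviour deterministically in a single thread. In the reported case the modification presumably comes from another collector thread, which makes the failure intermittent.

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration, not taken from the OpenNMS code base.
class CmeSketch {

    /** Returns true if iterating the key set while adding keys throws a CME. */
    static boolean iterateWhileModifying() {
        Map<String, Integer> attributes = new HashMap<>();
        attributes.put("ifInOctets", 1);
        attributes.put("ifOutOctets", 2);
        try {
            for (String key : attributes.keySet()) {
                // Adding a new key is a structural modification; the
                // iterator notices it on its next call to next().
                attributes.put(key + "-copy", 0);
            }
        } catch (ConcurrentModificationException e) {
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println("CME thrown: " + iterateWhileModifying());
    }
}
```

The check performed by `HashMap` iterators is best-effort, so when two threads really do share a map the exception may or may not appear on any given run.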
According to the log, the exception is raised in [Collectd-Thread-41-of-50], which is collecting one node for the XML service CiscoPorts-XML-Collection:
2015-02-27 11:43:48,876 ERROR [Collectd-Thread-41-of-50] o.o.n.c.CollectableService: An undeclared throwable was caught during data collection for interface 27/10.200.19.12/CiscoPorts XML-Collector
The log also shows that [Collectd-Thread-27-of-50], which is collecting a different node for the same service, resolves an incorrect RRD storage directory: it starts a collection for node 12 but gets dir = 27.
2015-02-27 11:43:48,063 INFO [Collectd-Thread-27-of-50] o.o.n.c.CollectableService: run: starting new collection for 12/10.200.39.12/CiscoPorts XML-Collector/CiscoPorts-XML-Collection
2015-02-27 11:43:48,804 DEBUG [Collectd-Thread-27-of-50] o.o.n.c.DefaultCollectionAgent: getStorageDir: isStoreByForeignSource = false, foreignSource = null, foreignId = null, dir = 27
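The comparison made by hand on these two log entries can be sketched as a small check, which may help when scanning the full collectd.log for other affected threads. It assumes, as this report implies, that with isStoreByForeignSource = false the storage directory should equal the node ID. The class name `StorageDirCheck` and the regular expressions are hypothetical and only cover the two line formats quoted here.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for reading collectd.log; not part of OpenNMS.
class StorageDirCheck {

    // Matches "starting new collection for <nodeId>/<ip>/<service>..."
    private static final Pattern START =
            Pattern.compile("starting new collection for (\\d+)/");

    // Matches "getStorageDir: ... dir = <dir>"
    private static final Pattern DIR =
            Pattern.compile("getStorageDir: .*\\bdir = (\\S+)");

    /** Node ID from a "starting new collection" line, or null if absent. */
    static String nodeId(String line) {
        Matcher m = START.matcher(line);
        return m.find() ? m.group(1) : null;
    }

    /** Directory from a "getStorageDir" line, or null if absent. */
    static String storageDir(String line) {
        Matcher m = DIR.matcher(line);
        return m.find() ? m.group(1) : null;
    }

    /** True when both values were found and the directory is not the node ID. */
    static boolean mismatch(String startLine, String dirLine) {
        String node = nodeId(startLine);
        String dir = storageDir(dirLine);
        return node != null && dir != null && !node.equals(dir);
    }

    public static void main(String[] args) {
        String start = "2015-02-27 11:43:48,063 INFO [Collectd-Thread-27-of-50] "
                + "o.o.n.c.CollectableService: run: starting new collection for "
                + "12/10.200.39.12/CiscoPorts XML-Collector/CiscoPorts-XML-Collection";
        String dir = "2015-02-27 11:43:48,804 DEBUG [Collectd-Thread-27-of-50] "
                + "o.o.n.c.DefaultCollectionAgent: getStorageDir: "
                + "isStoreByForeignSource = false, foreignSource = null, "
                + "foreignId = null, dir = 27";
        System.out.println("mismatch: " + mismatch(start, dir));
    }
}
```

Applied to the two entries from Collectd-Thread-27-of-50, the check extracts node 12 and directory 27 and flags them as a mismatch. Pairing lines by thread name across a whole log file is left out of this sketch.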
I have attached the full collectd.log, captured from OpenNMS startup. I hope it helps pinpoint the issue.
Notes:
- The exact same configuration ran perfectly well on OpenNMS 1.12.9-2.
- As far as I can see, only XML data collection is affected (SNMP data collection shows no problems).