High CPU usage due to DataCollectionConfigDao.getConfiguredResourceTypes() while Collectd starts
Description
While analyzing a large OpenNMS installation (~8000 nodes, ~45000 monitored services), I found from a JFR recording taken 1 hour after starting OpenNMS that the "hot method" (the most called and/or most expensive method) was DataCollectionConfigDao.getConfiguredResourceTypes().
When OpenNMS starts, Collectd schedules data collection for all the nodes, and that involves calling this method.
The method needs to be made faster, for a reason similar to the problem reported in NMS-6748.
Acceptance / Success Criteria
None
Activity
Alejandro Galue July 25, 2014 at 11:56 AM
Fixed in revision 5d05491a829a3ad4d9986989878e2a11e0c176f1 for 1.12.
Merged into master in revision 85b394b8fde0af8a31c69505a4592879faac9853.
Alejandro Galue July 25, 2014 at 11:53 AM
The proposed solution is to cache the resource types in a HashMap.
The cache is rebuilt every time datacollection-config.xml changes, so all the expensive operations run just once per configuration load.
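As a rough illustration of that idea (not the code from the actual commit), the sketch below caches the result of an expensive lookup in a HashMap and discards it when the configuration changes. The class name `ResourceTypeCache`, the `loader` supplier, the `onConfigChanged()` hook and the `String` value type are all assumptions made for the example; the real DAO and its types differ.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical sketch: these names are illustrative, not the OpenNMS classes.
class ResourceTypeCache {
    // Stands in for the expensive work of walking the parsed configuration.
    private final Supplier<Map<String, String>> loader;
    // null means "not built yet" or "invalidated by a config change".
    private volatile Map<String, String> cache;
    // Counts how often the expensive loader actually ran.
    private int loads = 0;

    ResourceTypeCache(Supplier<Map<String, String>> loader) {
        this.loader = loader;
    }

    // Cheap on every call except the first one after a (re)load.
    Map<String, String> getConfiguredResourceTypes() {
        Map<String, String> current = cache;
        if (current == null) {
            synchronized (this) {
                current = cache;
                if (current == null) {
                    // Copy into a HashMap and expose it read-only to callers.
                    current = Collections.unmodifiableMap(new HashMap<>(loader.get()));
                    loads++;
                    cache = current;
                }
            }
        }
        return current;
    }

    // Assumed to be called whenever the configuration file is reloaded.
    synchronized void onConfigChanged() {
        cache = null;
    }

    synchronized int getLoadCount() {
        return loads;
    }
}
```

With this shape, a thousand calls during scheduling trigger the expensive loader once, and a configuration reload triggers it once more on the next call.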
I've created a test in which I call this expensive method 1000 times. Without any code changes, the test takes 373832 µs to finish, while with the code improvements it takes 581 µs (more than 600 times faster).
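A measurement of that kind can be approximated with a small timing helper like the one below. This is only a sketch: `CallTimer` and `timeMicros` are hypothetical names, and the actual test in the commit may be structured differently. It does no JIT warm-up, so a harness such as JMH would be more reliable for anything beyond a coarse before/after comparison.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of OpenNMS.
class CallTimer {
    // Runs the task the given number of times and returns elapsed microseconds.
    static long timeMicros(Runnable task, int iterations) {
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            task.run();
        }
        return TimeUnit.NANOSECONDS.toMicros(System.nanoTime() - start);
    }
}
```

Passing the method under test as the `Runnable` with `iterations` set to 1000, once before and once after the change, gives the two figures to compare.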