CollectionResourceWrapper cache takes up large amounts of RAM
Description
On a customer machine, OpenNMS fails after several days of operation. A heap dump shows that a lot of RAM is being consumed by the s_cache static cache inside CollectionResourceWrapper. This cache stores values so that thresholds can be evaluated as the collections occur.
After investigating the cache, we found that its entries are instances of an inner class that is NOT static. Each entry therefore holds a reference to its enclosing object, so the outer objects leak into the cache. This greatly increases the retained size of the cache (probably by 10X) and causes a lot of garbage collection on the system. This is probably the primary cause of the performance problems. It would cause similar problems on any system that does a lot of data collection (the customer machine is collecting roughly 90,000 individual metrics).
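To illustrate the mechanism described above, here is a minimal, self-contained sketch. It is not the OpenNMS code: the class names (CacheLeakSketch, InnerEntry, StaticEntry), the fields, and the helper holdsOuterReference are all hypothetical. It shows that an instance of a non-static inner class carries a compiler-generated field pointing at its enclosing object, while a static nested class does not.

```java
import java.lang.reflect.Field;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for a wrapper object that owns a static cache.
public class CacheLeakSketch {
    // Stand-in for per-collection state that makes the outer object expensive to retain.
    private final byte[] payload = new byte[1024];
    private final String label;

    // Static cache: everything reachable from its entries stays alive.
    static final Map<String, Object> CACHE = new ConcurrentHashMap<>();

    public CacheLeakSketch(String label) {
        this.label = label;
    }

    // Non-static inner class: each instance keeps a hidden reference
    // to the CacheLeakSketch that created it.
    public class InnerEntry {
        final double value;

        public InnerEntry(double value) {
            this.value = value;
        }

        String describe() {
            return label + "=" + value; // reads state from the enclosing instance
        }
    }

    // Static nested class: holds only its own fields.
    public static class StaticEntry {
        final double value;

        public StaticEntry(double value) {
            this.value = value;
        }
    }

    // Returns true if the entry's class declares a field (including the
    // compiler-generated one) whose type is the outer class.
    public static boolean holdsOuterReference(Object entry) {
        for (Field f : entry.getClass().getDeclaredFields()) {
            if (f.getType() == CacheLeakSketch.class) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        CacheLeakSketch wrapper = new CacheLeakSketch("ifInOctets");
        CACHE.put("inner", wrapper.new InnerEntry(1.0));
        CACHE.put("static", new StaticEntry(1.0));

        System.out.println("inner entry retains wrapper:  " + holdsOuterReference(CACHE.get("inner")));
        System.out.println("static entry retains wrapper: " + holdsOuterReference(CACHE.get("static")));
    }
}
```

In this sketch, caching an InnerEntry keeps the whole wrapper (including its payload array) reachable for as long as the entry stays in the cache. Caching a StaticEntry retains only the value itself.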
Acceptance / Success Criteria
None
Activity
Seth Leger October 2, 2012 at 1:33 PM
This problem is fixed by marking the inner class as a static class. Marking as fixed. The change had already been made as part of an unrelated bugfix in 1.11.2 (commit dc8f7e419384cb1b1ba610e548bf4426a6b43228).