OpenNMS / NMS-4861

Runaway threads consuming CPU when rendering certain graphs

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 1.9.90
    • Fix Version/s: 1.9.91
    • Component/s: Data Output - RRD
    • Security Level: Default (Default Security Scheme)
    • Environment:
      OpenNMS 1.9.90 snapshot from 2011-08-07
      CentOS 5.5
      Dedicated Dell PE M610 w/12 cores of Xeon X5650 and 48 GB RAM running OpenNMS + PostgreSQL 9.0
      Oracle Java JDK 6u26
      19 managed nodes

      Description

      When attempting to display a specific graph I've created for Netscreen memory utilization, I see the following symptoms:

      • Page renders everything but the busted graph, and browser shows that the page is still loading, seemingly forever (at least a few minutes)
      • Reloading the page does not help. In fact, 'top' shows Java consuming one more CPU's worth of resources each time I reload, because every reload triggers another hung/spinning thread
      • Deleting the .jrb file and allowing it to be recreated also does not help

      I restarted OpenNMS and then triggered the condition via page reload, taking thread dumps at 1, 2, 3, 4, 5, 10, 15, and 20 hung threads. That data is attached in the "thread dumps" zipfile.
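
      For readers trying to narrow down which threads are spinning, the following is a minimal sketch of ranking a JVM's threads by accumulated CPU time using the standard java.lang.management API. It is not how the attached dumps were produced (the report does not say which tool was used), and the class name BusyThreadReport and method topByCpu are hypothetical. As written it inspects only the JVM it runs in, so examining a running OpenNMS process would require running equivalent logic inside that JVM or over a JMX connection.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of OpenNMS: ranks the threads of the
// current JVM by the CPU time they have consumed so far.
public class BusyThreadReport {

    /**
     * Returns up to `limit` entries, busiest thread first. Each entry holds
     * the thread name, id, state, CPU time, and up to `maxFrames` stack frames.
     * Returns an empty list if the JVM cannot measure per-thread CPU time.
     */
    public static List<String> topByCpu(int limit, int maxFrames) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        List<String> report = new ArrayList<>();
        if (limit <= 0 || !mx.isThreadCpuTimeSupported() || !mx.isThreadCpuTimeEnabled()) {
            return report;
        }

        // Collect {threadId, cpuNanos} pairs; -1 means the thread already exited.
        List<long[]> samples = new ArrayList<>();
        for (long id : mx.getAllThreadIds()) {
            long cpuNanos = mx.getThreadCpuTime(id);
            if (cpuNanos >= 0) {
                samples.add(new long[] {id, cpuNanos});
            }
        }

        // Sort descending by CPU time so runaway threads come first.
        samples.sort((a, b) -> Long.compare(b[1], a[1]));

        for (long[] sample : samples) {
            if (report.size() >= limit) {
                break;
            }
            ThreadInfo info = mx.getThreadInfo(sample[0], maxFrames);
            if (info == null) {
                continue; // thread ended between the two calls
            }
            StringBuilder entry = new StringBuilder();
            entry.append(String.format("\"%s\" id=%d state=%s cpu=%d ms",
                    info.getThreadName(), info.getThreadId(), info.getThreadState(),
                    sample[1] / 1_000_000L));
            for (StackTraceElement frame : info.getStackTrace()) {
                entry.append(System.lineSeparator()).append("    at ").append(frame);
            }
            report.add(entry.toString());
        }
        return report;
    }

    public static void main(String[] args) {
        // Print the five busiest threads with up to eight frames each.
        for (String entry : topByCpu(5, 8)) {
            System.out.println(entry);
        }
    }
}
```

      Comparing two such reports taken a few seconds apart should show a spinning thread as one that stays RUNNABLE while its CPU time keeps growing, which matches the "one CPU per reload" symptom described above.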

      I have also attached the corresponding .jrb files for three Netscreen nodes that all exhibit this behavior. I'm using store by group and have included the ds.properties file, as well as the graph definition (report.juniper.netscreen.host.memory is the glitched graph).

            People

            • Assignee: Donald Desloge (desloge)
            • Reporter: Andy Ellsworth (andye@narsnet.com)