  1. OpenNMS
  2. NMS-4912

JdbcCollector freezes Collectd when it uses data sources defined in opennms-datasources.xml instead of its own connections.

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 1.8.13, 1.9.90
    • Fix Version/s: None
    • Component/s: Data Collection - JDBC
    • Security Level: Default (Default Security Scheme)
    • Labels:

      Description

      This issue is related to support ticket 616.

      Currently there are two ways to configure the JDBC Collector:

      1) Declare the external data sources in opennms-datasources.xml and let the DB Pool Manager manage the connections.
      2) Declare the database connection directly in the JDBC Collector service definition in collectd-configuration.xml.

      With the first method, Collectd stops working after a few hours.

      I could reproduce the problem locally.

      Here is the current state of the collector thread after collectd is frozen:

      "CollectdScheduler-50 Pool-fiber0" prio=5 tid=101fe9000 nid=0x10d627000 in Object.wait() [10d626000]
         java.lang.Thread.State: WAITING (on object monitor)
          at java.lang.Object.wait(Native Method)
          at com.mchange.v2.resourcepool.BasicResourcePool.awaitAvailable(BasicResourcePool.java:1315)
          at com.mchange.v2.resourcepool.BasicResourcePool.prelimCheckoutResource(BasicResourcePool.java:557)
          - locked <7efe922b8> (a com.mchange.v2.resourcepool.BasicResourcePool)
          at com.mchange.v2.resourcepool.BasicResourcePool.checkoutResource(BasicResourcePool.java:477)
          at com.mchange.v2.c3p0.impl.C3P0PooledConnectionPool.checkoutPooledConnection(C3P0PooledConnectionPool.java:525)
          at com.mchange.v2.c3p0.impl.AbstractPoolBackedDataSource.getConnection(AbstractPoolBackedDataSource.java:128)
          at org.opennms.netmgt.config.C3P0ConnectionFactory.getConnection(C3P0ConnectionFactory.java:230)
          at org.springframework.jdbc.datasource.LazyConnectionDataSourceProxy$LazyConnectionInvocationHandler.getTargetConnection(LazyConnectionDataSourceProxy.java:393)
          at org.springframework.jdbc.datasource.LazyConnectionDataSourceProxy$LazyConnectionInvocationHandler.invoke(LazyConnectionDataSourceProxy.java:366)
          at $Proxy38.getMetaData(Unknown Source)
          at org.opennms.netmgt.collectd.JdbcCollector.isGroupAvailable(JdbcCollector.java:316)
          at org.opennms.netmgt.collectd.JdbcCollector.collect(JdbcCollector.java:216)
          at org.opennms.netmgt.collectd.CollectionSpecification.collect(CollectionSpecification.java:277)
          at org.opennms.netmgt.collectd.CollectableService.doCollection(CollectableService.java:382)
          at org.opennms.netmgt.collectd.CollectableService.run(CollectableService.java:316)
          at org.opennms.netmgt.scheduler.LegacyScheduler$1.run(LegacyScheduler.java:295)
          at org.opennms.core.concurrent.RunnableConsumerThreadPool$FiberThreadImpl.run(RunnableConsumerThreadPool.java:427)
          at java.lang.Thread.run(Thread.java:680)
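
      The dump shows the collector thread parked in BasicResourcePool.awaitAvailable, which means it is waiting for a pooled connection to become free. As a rough aid for confirming the same state from inside a JVM, the sketch below scans live threads for that frame. The class name PoolWaitDetector and its helper methods are hypothetical; only the c3p0 class and method names come from the trace above. It inspects only the JVM it runs in, so for a running OpenNMS instance an external thread dump (for example with jstack) gives the same information.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of OpenNMS or c3p0.
public class PoolWaitDetector {
    // Class and method names copied from the stack trace in this issue.
    static final String POOL_CLASS = "com.mchange.v2.resourcepool.BasicResourcePool";
    static final String WAIT_METHOD = "awaitAvailable";

    /** Returns true if any frame of the stack is BasicResourcePool.awaitAvailable. */
    static boolean isWaitingForPool(StackTraceElement[] frames) {
        for (StackTraceElement frame : frames) {
            if (POOL_CLASS.equals(frame.getClassName())
                    && WAIT_METHOD.equals(frame.getMethodName())) {
                return true;
            }
        }
        return false;
    }

    /** Names of live threads in this JVM that are waiting for a pooled connection. */
    static List<String> threadsWaitingForPool() {
        List<String> names = new ArrayList<>();
        for (Map.Entry<Thread, StackTraceElement[]> entry
                : Thread.getAllStackTraces().entrySet()) {
            if (isWaitingForPool(entry.getValue())) {
                names.add(entry.getKey().getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println("Threads waiting for a pooled connection: "
                + threadsWaitingForPool());
    }
}
```

      If the list keeps growing between collection cycles, the pool is probably not getting its connections back.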
      

      Here are the details of the configuration that does not work:

      a) Declare a data-source on opennms-datasources.xml:

        <jdbc-data-source name="mysql-test" 
                  database-name="test_database" 
                  class-name="com.mysql.jdbc.Driver" 
                  url="jdbc:mysql://192.168.0.8:3306/test_database"
                  user-name="opennms" password="secret" />
      

      The MySQL server is running on 192.168.0.8 and OpenNMS is running on a different server (192.168.0.5).

      b) Configure the JDBC Collection using the above data-source:

          <package name="test-mysql-databases">
              <filter>IPADDR IPLIKE 192.168.0.8</filter>
              <service name="MySQL_test" interval="30000" user-defined="false" status="on">
                  <parameter key="collection" value="mysql-test-collection"/>
                  <parameter key="thresholding-enabled" value="true"/>
                  <parameter key="data-source" value="mysql-test"/>
              </service>
          </package>
      

      c) Declare some queries inside jdbc-datacollection-config.xml (the details of this are not relevant)

      d) To reproduce the problem quickly, I set the collection interval to 30 seconds (remember to update jdbc-datacollection-config.xml to match).

      e) Provision a node with the service MySQL_test.

      After a few minutes (or hours, if you keep the default collection interval of 5 minutes), Collectd freezes and the JRB files are no longer updated.
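
      A freeze that only appears after a number of collection cycles is what one would expect if pooled connections are checked out but never returned. The sketch below illustrates that general failure mode with a tiny stand-in pool. It is not c3p0 and it is not the confirmed root cause of this issue; TinyPool, leakyCollections and carefulCollections are hypothetical names. The stand-in uses a short timeout so the example terminates, whereas c3p0's default checkoutTimeout of 0 waits indefinitely, which would match a thread parked in awaitAvailable.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration of pool exhaustion; not OpenNMS or c3p0 code.
public class PoolExhaustionSketch {

    /** Tiny stand-in for a bounded connection pool. */
    static final class TinyPool {
        private final BlockingQueue<Object> idle;

        TinyPool(int size) {
            idle = new ArrayBlockingQueue<>(size);
            for (int i = 0; i < size; i++) {
                idle.add(new Object()); // each Object stands in for a connection
            }
        }

        /** Returns a connection, or null if none becomes free within the timeout. */
        Object checkout(long timeoutMillis) {
            try {
                return idle.poll(timeoutMillis, TimeUnit.MILLISECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }

        void checkin(Object connection) {
            idle.add(connection);
        }
    }

    /** Runs collection cycles that never return the connection to the pool. */
    static int leakyCollections(TinyPool pool, int attempts) {
        int completed = 0;
        for (int i = 0; i < attempts; i++) {
            Object connection = pool.checkout(50);
            if (connection == null) {
                break; // pool exhausted: a real collector would hang here
            }
            completed++; // "collect", then forget to check the connection back in
        }
        return completed;
    }

    /** Runs collection cycles that always return the connection in a finally block. */
    static int carefulCollections(TinyPool pool, int attempts) {
        int completed = 0;
        for (int i = 0; i < attempts; i++) {
            Object connection = pool.checkout(50);
            if (connection == null) {
                break;
            }
            try {
                completed++;
            } finally {
                pool.checkin(connection);
            }
        }
        return completed;
    }

    public static void main(String[] args) {
        System.out.println("leaky:   " + leakyCollections(new TinyPool(3), 10));
        System.out.println("careful: " + carefulCollections(new TinyPool(3), 10));
    }
}
```

      With a pool of three connections, the leaky loop completes three cycles and then stalls, while the careful loop completes all ten. A shorter collection interval uses up the pool sooner, which is consistent with the 30-second interval reproducing the freeze within minutes.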

      The workaround:

      a) Remove or comment out the data sources for external databases in opennms-datasources.xml.

      b) Reconfigure the package in collectd-configuration.xml to use its own connection (remember to remove the data-source parameter):

          <package name="test-mysql-databases">
              <filter>IPADDR IPLIKE 192.168.0.8</filter>
              <service name="MySQL_test" interval="30000" user-defined="false" status="on">
                  <parameter key="collection" value="mysql-test-collection"/>
                  <parameter key="thresholding-enabled" value="true"/>
                  <parameter key="driver" value="com.mysql.jdbc.Driver"/>
                  <parameter key="user" value="opennms"/>
                  <parameter key="password" value="secret"/>
                  <parameter key="url" value="jdbc:mysql://OPENNMS_JDBC_HOSTNAME:3306/test_database"/>
              </service>
          </package>
      

      c) Restart OpenNMS.

            People

            Assignee:
            Unassigned
            Reporter:
            agalue Alejandro Galue
            Votes:
            0
            Watchers:
            2
