  1. OpenNMS
  2. NMS-6412

Provisiond detectors sometimes fail to detect



    • NMS Sprint 1


      I'm finally transitioning to using Provisiond for discovery, but I am not able to reliably detect services. On my test system I'm trying to detect services on a single node that has ping, SSH, HTTP on port 80, HTTPS on port 443, and another HTTPS service on port 8443. After creating a provisioning requisition, I add an HTTPS detector with a port parameter of 8443 to the list of detectors in the foreign source definition, leaving the default detectors intact. Clicking the synchronize button fires off discovery, but only ICMP and SSH get detected. Bumping the Provisiond log level up to DEBUG shows that detection of all other services returns false, with a stack trace like this in the log:

      2014-02-18 22:30:55,536 DEBUG [NioSocketConnector-2] ConnectionFactoryNewConnectorImpl$1: Exception of type org.apache.mina.core.RuntimeIoException caught, disposing of connector: NioSocketConnector-2
      org.apache.mina.core.RuntimeIoException: Failed to get the session.
      at org.apache.mina.core.future.DefaultConnectFuture.getSession(DefaultConnectFuture.java:58)
      at org.opennms.netmgt.provision.support.ConnectionFactoryNewConnectorImpl$1.operationComplete(ConnectionFactoryNewConnectorImpl.java:147)
      at org.opennms.netmgt.provision.support.ConnectionFactoryNewConnectorImpl$1.operationComplete(ConnectionFactoryNewConnectorImpl.java:141)
      at org.apache.mina.core.future.DefaultIoFuture.notifyListener(DefaultIoFuture.java:375)
      at org.apache.mina.core.future.DefaultIoFuture.notifyListeners(DefaultIoFuture.java:365)
      at org.apache.mina.core.future.DefaultIoFuture.setValue(DefaultIoFuture.java:288)
      at org.apache.mina.core.future.DefaultConnectFuture.setException(DefaultConnectFuture.java:94)
      at org.apache.mina.core.polling.AbstractPollingIoConnector.processTimedOutSessions(AbstractPollingIoConnector.java:470)
      at org.apache.mina.core.polling.AbstractPollingIoConnector.access$800(AbstractPollingIoConnector.java:64)
      at org.apache.mina.core.polling.AbstractPollingIoConnector$Connector.run(AbstractPollingIoConnector.java:513)
      at org.apache.mina.util.NamePreservingRunnable.run(NamePreservingRunnable.java:64)
      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
      at java.lang.Thread.run(Thread.java:744)
      Caused by: java.net.ConnectException: Connection timed out.
      ... 7 more
      2014-02-18 22:30:55,537 INFO [NioSocketConnector-2] AsyncBasicDetectorMinaImpl$2: Connection exception occurred: java.net.ConnectException: Connection timed out. for service HTTP, retrying attempt 1

      However, if I remove the detectors for the services that aren't on this node, the expected services do get detected properly. Also, since Provisiond seems to run detectors in the same order as they are defined in the foreign source definition, if I remove them one by one, previously failing services start getting detected as I remove the detectors defined before them that I know will fail. It seems like all the NioSocketConnector instances share a reference to something that prevents sessions from being created once a single attempt times out.
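      For anyone trying to reproduce this, it may help to first confirm outside of Provisiond which ports on the node actually accept connections. The following is only a minimal sketch under stated assumptions: it uses plain java.net.Socket rather than MINA, it is not Provisiond's detector code, and the class name PortProbe, the default host, and the extra ports 21 and 25 (stand-ins for default detectors expected to fail) are hypothetical. Each probe uses its own socket and its own timeout, so a timeout on one port cannot influence the next, which is the behavior I would expect from the detectors.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.util.LinkedHashMap;
import java.util.Map;

public class PortProbe {

    // Returns true if a TCP connection to host:port succeeds within timeoutMillis.
    // A refused connection or a connect timeout both count as "not open".
    static boolean isOpen(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // SocketTimeoutException and ConnectException are both IOExceptions.
            return false;
        }
    }

    // Probes the ports in the given order, one fresh socket per port,
    // and records the result for each port in that same order.
    static Map<Integer, Boolean> probeAll(String host, int[] ports, int timeoutMillis) {
        Map<Integer, Boolean> results = new LinkedHashMap<>();
        for (int port : ports) {
            results.put(port, isOpen(host, port, timeoutMillis));
        }
        return results;
    }

    public static void main(String[] args) {
        // Hypothetical defaults: pass the node's address as the first argument.
        String host = args.length > 0 ? args[0] : "127.0.0.1";
        // 22, 80, 443 and 8443 are the services from this report;
        // 21 and 25 are assumed examples of services not present on the node.
        int[] ports = {21, 22, 25, 80, 443, 8443};
        for (Map.Entry<Integer, Boolean> e : probeAll(host, ports, 3000).entrySet()) {
            System.out.println("port " + e.getKey() + ": " + (e.getValue() ? "open" : "closed or timed out"));
        }
    }
}
```

      If the ports that Provisiond reports as failing show up as open here, even when probed after ports that time out, that would point at the detector connection handling rather than at the network.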

      This may be related to the fix for the leaking file descriptors that uses the IoSessionInitializer callback, but I don't know anything about MINA and couldn't find any good documentation or examples for it either.


        1. collectd-configuration.xml (6 kB)
        2. nms_6412_foreign_src_changed.tdump.gz (9 kB)
        3. nms_6412_midscan.tdump.gz (10 kB)
        4. nms_6412_midsuccess.tdump.gz (11 kB)
        5. nms_6412_postsuccess.tdump.gz (10 kB)
        6. nms_6412_prescan.tdump.gz (9 kB)
        7. nms6412-default-constructor.patch (2 kB)
        8. TestCUCMXml.xml (2 kB)
        9. TestCUCMXml.xml-req (0.4 kB)
        10. threaddump-1392919864747.tdump.gz (29 kB)
        11. xml-datacollection-config.xml (8 kB)


              seth Seth Leger
              schlend David Schlenk