Uncaught exception in HostResourceSwRunMonitor when handling empty strings
Fixed in revision 72971b0c0acddb76c62b0499c0b547e9baccb743 for 1.12.
The customer has verified that the solution works as expected.
Jeff Gehlbach February 7, 2014 at 12:35 PM
Looks good to me, thanks!
Alejandro Galue February 7, 2014 at 12:15 PM
I'm proposing the following fix:
The customer is willing to test whether the solution works in his environment, because reproducing the problem with a JUnit test and the Mock SNMP Agent is not easy (it happens with a bad implementation of the HOST-RESOURCES-MIB).
Alejandro Galue February 7, 2014 at 12:13 PM
A customer is having the same issue running OpenNMS 1.12.3.
Checking the code, I saw that it still assumes that the SnmpValue objects for the service name and the status are never null, which unfortunately does not hold for some SNMP agents.
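To illustrate the kind of guard this implies, here is a minimal sketch. It is not the change committed for 1.12. The `Value` interface is a hypothetical stand-in for an SNMP value (it is not the OpenNMS SnmpValue API), and `displayString` and `matches` are names invented for this example. The idea is to treat a missing or empty value as "no match" instead of dereferencing it.

```java
import java.util.Optional;

public class NullSafeSwRunSketch {

    /** Hypothetical stand-in for an SNMP value; NOT the OpenNMS SnmpValue API. */
    public interface Value {
        String toDisplayString();
    }

    /** Returns the display string, or empty when the agent returned nothing usable. */
    public static Optional<String> displayString(Value value) {
        if (value == null) {
            return Optional.empty();
        }
        String s = value.toDisplayString();
        if (s == null || s.isEmpty()) {
            return Optional.empty();
        }
        return Optional.of(s);
    }

    /** A missing or empty value never matches the expected service name. */
    public static boolean matches(Value name, String expected) {
        return displayString(name).map(s -> s.equals(expected)).orElse(false);
    }

    public static void main(String[] args) {
        System.out.println(matches(null, "tomcat"));           // agent returned no value
        System.out.println(matches(() -> "", "tomcat"));       // zero-length string
        System.out.println(matches(() -> "tomcat", "tomcat")); // normal case
    }
}
```

With a guard like this, a table row with a null or zero-length name is simply skipped while the remaining rows are still evaluated.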
Description
If the HostResourceSwRunMonitor is configured with an alternative "service-name-oid" parameter that points to e.g. hrSwRunParameters (.1.3.6.1.2.1.25.4.2.1.5) and it comes across any processes with zero-length parameter strings, the monitor will fail to catch a StringIndexOutOfBoundsException coming from the antlr StringUtils class. Here is an illustrative exception stack trace from poller.log (see https://mynms.opennms.com/Ticket/Display.html?id=2207):
2013-05-06 10:26:51,186 DEBUG [PollerScheduler-30 Pool-fiber3] HostResourceSwRunMonitor: HostResourceSwRunMonitor: Unexpected exception during SNMP poll of interface 192.168.200.23
java.lang.StringIndexOutOfBoundsException: String index out of range: -1
    at java.lang.String.substring(String.java:1937)
    at antlr.StringUtils.stripFrontBack(StringUtils.java:83)
    at org.opennms.netmgt.poller.monitors.HostResourceSwRunMonitor.stripExtraQuotes(HostResourceSwRunMonitor.java:300)
    at org.opennms.netmgt.poller.monitors.HostResourceSwRunMonitor.poll(HostResourceSwRunMonitor.java:247)
    at org.opennms.netmgt.poller.pollables.LatencyStoringServiceMonitorAdaptor.poll(LatencyStoringServiceMonitorAdaptor.java:104)
    at org.opennms.netmgt.poller.pollables.PollableServiceConfig.poll(PollableServiceConfig.java:109)
    at org.opennms.netmgt.poller.pollables.PollableService.poll(PollableService.java:178)
    at org.opennms.netmgt.poller.pollables.PollableElement.poll(PollableElement.java:292)
    at org.opennms.netmgt.poller.pollables.PollableContainer$5.run(PollableContainer.java:305)
And the corresponding service and monitor definition, slightly sanitized:
<service name="AFEED00-TOMCAT" interval="300000"
         user-defined="false" status="on">
  <parameter key="retry" value="1"/>
  <parameter key="timeout" value="3000"/>
  <parameter key="service-name" value="-Djava.util.logging.config.file=/usr/local/example/apache-t"/>
  <parameter key="service-name-oid" value=".1.3.6.1.2.1.25.4.2.1.5"/>
</service>
<monitor service="AFEED00-TOMCAT" class-name="org.opennms.netmgt.poller.monitors.HostResourceSwRunMonitor"/>
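The trace shows the exception surfacing in stripExtraQuotes by way of antlr.StringUtils.stripFrontBack. As a rough sketch of how that step could tolerate empty or degenerate values, the helper below strips one pair of enclosing double quotes using only java.lang.String. This is an assumption-laden illustration, not the monitor's actual code and not the committed fix: the class name is made up, and its semantics (only a leading and trailing quote are removed, null becomes an empty string) are choices made for this example.

```java
public class QuoteStripSketch {

    /**
     * Strips one pair of enclosing double quotes, if present.
     * Never throws for null, empty, or single-character input.
     */
    public static String stripExtraQuotes(String s) {
        if (s == null) {
            return ""; // assumption: treat a missing value as an empty string
        }
        // Require at least two characters so that a lone quote is left untouched
        // instead of producing an invalid substring range.
        if (s.length() >= 2 && s.startsWith("\"") && s.endsWith("\"")) {
            return s.substring(1, s.length() - 1);
        }
        return s;
    }

    public static void main(String[] args) {
        System.out.println("[" + stripExtraQuotes("") + "]");           // zero-length parameter string
        System.out.println("[" + stripExtraQuotes("\"") + "]");         // lone quote character
        System.out.println("[" + stripExtraQuotes("\"tomcat\"") + "]"); // quoted value
        System.out.println("[" + stripExtraQuotes("-Djava.util.logging.config.file=/usr/local/example/apache-t") + "]");
    }
}
```

The length check is what matters here: calling substring with an end index smaller than the start index is what produces a StringIndexOutOfBoundsException like the one in the trace.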