DCB fails on newly provisioned nodes

Description

To repro:

Provision a fresh node with metadata dcb:username, dcb:password, and dcb:script-file, e.g.:

<node foreign-id="1658852690938" node-label="test-switch-cisco">
  <interface ip-addr="192.168.1.231" status="1" snmp-primary="P">
    <monitored-service service-name="DeviceConfig"/>
  </interface>
  <meta-data context="requisition" key="dcb:username" value="${scv:test-switch:username}"/>
  <meta-data context="requisition" key="dcb:password" value="${scv:test-switch:password}"/>
  <meta-data context="requisition" key="dcb:script-file" value="cisco-ios-startup"/>
  <meta-data context="requisition" key="dcb:retention-period" value="1w"/>
  <meta-data context="requisition" key="dcb:schedule" value="0 0 1 * * ?"/>
</node>

Sync the requisition, wait for the node to be available / wait for the uei.opennms.org/internal/provisiond/nodeScanCompleted event; then

Try to run dcb-trigger (or, for that matter, dcb-get) via Karaf shell:

admin@opennms()> opennms:dcb-trigger -v 192.168.1.231
Error executing command: java.lang.IllegalArgumentException

Karaf.log contains:

2022-08-22T19:26:41,947 | ERROR | Karaf ssh console user null | ShellUtil | 61 - org.apache.karaf.shell.core - 4.3.6 | Exception caught while executing command
java.lang.IllegalArgumentException: null
    at java.util.Optional.orElseThrow(Optional.java:408) ~[?:?]
    at org.opennms.features.deviceconfig.service.impl.DeviceConfigServiceImpl.pollDeviceConfig(DeviceConfigServiceImpl.java:189) ~[?:?]
    at org.opennms.features.deviceconfig.service.impl.DeviceConfigServiceImpl.triggerConfigBackup(DeviceConfigServiceImpl.java:98) ~[?:?]
    at org.opennms.features.deviceconfig.shell.DcbTriggerCommand.execute(DcbTriggerCommand.java:78) ~[?:?]
    at org.apache.karaf.shell.impl.action.command.ActionCommand.execute(ActionCommand.java:84) ~[?:?]
    at org.apache.karaf.shell.impl.console.osgi.secured.SecuredCommand.execute(SecuredCommand.java:68) ~[?:?]
    at org.apache.karaf.shell.impl.console.osgi.secured.SecuredCommand.execute(SecuredCommand.java:86) ~[?:?]
    at org.apache.felix.gogo.runtime.Closure.executeCmd(Closure.java:599) ~[?:?]
    at org.apache.felix.gogo.runtime.Closure.executeStatement(Closure.java:526) ~[?:?]
    at org.apache.felix.gogo.runtime.Closure.execute(Closure.java:415) ~[?:?]
    at org.apache.felix.gogo.runtime.Pipe.doCall(Pipe.java:416) ~[?:?]
    at org.apache.felix.gogo.runtime.Pipe.call(Pipe.java:229) ~[?:?]
    at org.apache.felix.gogo.runtime.Pipe.call(Pipe.java:59) ~[?:?]
    at java.util.concurrent.FutureTask.run(FutureTask.java:264) ~[?:?]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) ~[?:?]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) ~[?:?]
    at java.lang.Thread.run(Thread.java:829) [?:?]
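
For anyone reading the trace: the top frame is Optional.orElseThrow, so the exception most likely comes from an Optional lookup that came back empty, with a supplier that builds an IllegalArgumentException without a message (hence the "null" after the exception name). The following is only a minimal sketch of that pattern, not the actual DeviceConfigServiceImpl code; the class name, the KNOWN map, and the method names are all hypothetical and used for illustration only.

```java
import java.util.Map;
import java.util.Optional;

public class OrElseThrowSketch {
    // Hypothetical stand-in for whatever lookup the service performs; not OpenNMS code.
    private static final Map<String, String> KNOWN = Map.of("10.0.0.1", "DeviceConfig");

    // Returns an empty Optional when the address is not known to the lookup.
    public static Optional<String> findService(String ipAddress) {
        return Optional.ofNullable(KNOWN.get(ipAddress));
    }

    // An empty Optional plus a no-argument exception constructor yields
    // an IllegalArgumentException whose message is null.
    public static String requireService(String ipAddress) {
        return findService(ipAddress).orElseThrow(IllegalArgumentException::new);
    }

    // Returns "found" on success, otherwise the exception message rendered as text.
    public static String failureMessage(String ipAddress) {
        try {
            requireService(ipAddress);
            return "found";
        } catch (IllegalArgumentException e) {
            return String.valueOf(e.getMessage());
        }
    }

    public static void main(String[] args) {
        System.out.println(failureMessage("192.168.1.231"));
    }
}
```

If the real lookup behaves like this, the message-less exception only says that something was not found for the given address at the time of the call, which would be consistent with the command succeeding after a restart.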

However, if OpenNMS is restarted, this will work immediately. It would be super awesome if we didn't have to restart OpenNMS after provisioning a node we intend to use with DCB.

Acceptance / Success Criteria

None

Activity

Alexander Chadfield September 20, 2022 at 5:42 PM

The resolution was to use a single Poller for retrieving all nodes.

Alexander Chadfield September 2, 2022 at 2:19 PM

Also it doesn't work from the WebUI, with the error: Unable to trigger config backup for 192.168.68.110 at location Default with configType DeviceConfig

Node added through the Requisition page.

Jeff Gehlbach August 23, 2022 at 1:54 PM

Pulling in for eng uptake since the workaround is quite an unpleasant one.

Jesse White August 23, 2022 at 12:49 PM

I suspect this only affects the Karaf shell commands - if these were triggered by the poller they would go through.

Fixed

Details

Docs Needed

No

Created August 22, 2022 at 11:35 PM
Updated October 11, 2022 at 1:54 PM
Resolved September 20, 2022 at 5:42 PM
