I have some services with frequent, short SNMP outages (down for ~30 seconds), and not all of the notifications are getting auto-acked. So despite having a 2m delay on the notification path, I'm getting notified about issues that are already resolved, and I'm never told that they are resolved.
An example of a recent outage's timing:
But, the notification related to that (linked to the same outage event):
| Sent To | Sent At | Media | Contact Info |
So the notification didn't get acked at 10:06:07, and I got an email at 10:07:51 even though the outage was already resolved.
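To put a number on that gap, here's a quick sketch of the arithmetic using the two timestamps above. It assumes 10:06:07 is when the service came back up (and so when the auto-ack should have happened); the helper name `seconds_between` is just for illustration.

```python
from datetime import datetime

FMT = "%H:%M:%S"


def seconds_between(earlier: str, later: str) -> int:
    """Whole seconds from `earlier` to `later`, both HH:MM:SS on the same day."""
    delta = datetime.strptime(later, FMT) - datetime.strptime(earlier, FMT)
    return int(delta.total_seconds())


# Timestamps taken from the example above.
expected_ack = "10:06:07"  # assumption: the moment the outage resolved
email_sent = "10:07:51"

gap = seconds_between(expected_ack, email_sent)
print(f"Email went out {gap} s after the notification should have been acked")
```

So there was well over a minute in which the auto-ack could have cancelled the email before the delay on the destination path ran out.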
My config for notifications:
- Using auto-acknowledge-alarm:
- Default queue handler stuff:
- Destination path:
Right now, nothing shows up in notifd.log for this whole month (I assume the default logging level isn't going to show me anything), but I do see this in alarmd.log for the specific alarm in question:
2020-07-08 10:10:31,613 WARN [alarmd-Thread-4-of-4] o.o.n.a.d.DroolsAlarmContext: Failed to acquire Drools session lock within 20000ms. Add or update for alarm with id=6059035 and reduction-key=uei.opennms.org/nodes/nodeLostService::2751:10.xx.xx.xx:SNMP will not be immediately reflected in the context.
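In case it helps anyone correlate these warnings with specific alarms, here's a rough sketch that pulls the timeout, alarm id, and reduction key out of a message like the one above. The regex is written against this one message only, so treat the pattern and the function name `parse_lock_warning` as assumptions rather than anything official.

```python
import re

# Pattern based solely on the warning text quoted above; other OpenNMS
# versions may word the message differently.
LOCK_WARNING_RE = re.compile(
    r"Failed to acquire Drools session lock within (?P<timeout_ms>\d+)ms\."
    r"\s+Add or update for alarm with id=(?P<alarm_id>\d+)"
    r"\s+and reduction-key=(?P<reduction_key>\S+)"
)


def parse_lock_warning(text):
    """Return the timeout, alarm id, and reduction key, or None if no match."""
    match = LOCK_WARNING_RE.search(text)
    if match is None:
        return None
    return {
        "timeout_ms": int(match.group("timeout_ms")),
        "alarm_id": int(match.group("alarm_id")),
        "reduction_key": match.group("reduction_key"),
    }


sample = (
    "2020-07-08 10:10:31,613 WARN [alarmd-Thread-4-of-4] "
    "o.o.n.a.d.DroolsAlarmContext: Failed to acquire Drools session lock "
    "within 20000ms. Add or update for alarm with id=6059035 "
    "and reduction-key=uei.opennms.org/nodes/nodeLostService::2751:10.xx.xx.xx:SNMP "
    "will not be immediately reflected in the context."
)

parsed = parse_lock_warning(sample)
print(parsed)
```

The `\s+` between the fragments means the pattern also matches when the message is wrapped across two lines, as long as the whole message is passed in as one string.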