Override ShutdownStrategy in Syslogd, Trapd Camel contexts

Description

We do not override the ShutdownStrategy in the syslog and trap Camel contexts, so they use the default 300-second (5-minute) shutdown timeout whenever in-flight messages are still present in the Camel queues.

This causes problems when syslog or trap messages are received shortly after startup (which is likely), because Karaf is still refreshing the Camel contexts at that point. When a refresh happens, the context must be shut down and restarted, and the shutdown can hit the 5-minute timeout. This makes Minion startup take minutes and may cause Karaf feature install timeouts.

We need to override the default ShutdownStrategy with a strategy that uses reduced timeout values.
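
As an illustration of the behavior we want, here is a minimal, standard-library-only Java sketch of a shutdown with a bounded grace period. It is not the OpenNMS fix and does not use Camel; the class name `BoundedShutdown` and the simulated stuck task are hypothetical. In Camel itself, the equivalent knob should be the timeout on the context's ShutdownStrategy (for example `DefaultShutdownStrategy.setTimeout(...)`, installed with `CamelContext.setShutdownStrategy(...)`).

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class BoundedShutdown {

    /**
     * Stops accepting new work, waits up to the given timeout for in-flight
     * tasks to finish, then interrupts whatever is still running.
     *
     * @return true if all tasks finished within the timeout, false if the
     *         pool had to be forced down
     */
    public static boolean shutdown(ExecutorService pool, long timeout, TimeUnit unit)
            throws InterruptedException {
        pool.shutdown(); // reject new tasks, let queued ones run
        if (pool.awaitTermination(timeout, unit)) {
            return true; // drained gracefully
        }
        pool.shutdownNow(); // timeout expired: interrupt stuck tasks
        return false;
    }

    public static void main(String[] args) throws InterruptedException {
        ExecutorService pool = Executors.newSingleThreadExecutor();

        // Hypothetical stand-in for a message that keeps retrying delivery
        // to a server that is down.
        pool.submit(() -> {
            try {
                Thread.sleep(300_000);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        long start = System.nanoTime();
        boolean graceful = shutdown(pool, 200, TimeUnit.MILLISECONDS);
        long elapsedMs = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        System.out.println("graceful=" + graceful + ", elapsedMs=" + elapsedMs);
    }
}
```

The design point is the same as in this issue: the caller picks a short upper bound on how long shutdown may block, and in-flight work that cannot complete in that window is abandoned rather than holding up the restart.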

Acceptance / Success Criteria

None

Activity

Seth Leger October 5, 2016 at 10:47 AM

These changes have been merged into develop for inclusion in 19. Marking as fixed.

commit ec46d074fe044d50d056c1bed2ed8e1075d97bc3
commit 3bfddb1837e22a99fa3ae119fe43c5b49f0fadc2

Chandra Gorantla October 3, 2016 at 4:58 PM

PR: https://github.com/OpenNMS/opennms/pull/1056

Verified that the trapd listener shutdown now happens in 15 seconds instead of 300 seconds.

Seth Leger October 1, 2016 at 10:09 AM

It is easy to reproduce this problem by restarting the minion service a couple of times; eventually it will get caught trying to fetch the SNMPv3 config. I think you can also reproduce it by:

  • Start up minion + opennms

  • Shut down opennms

  • Send a few syslog or trap messages to the minion system

  • Immediately shut minion down

At that point minion should still be retrying to send the syslogs/traps to opennms (which is down), and this will cause the shutdown to take 5 minutes for each context that has the default timeout.

Fixed

Created September 24, 2016 at 3:32 PM
Updated October 5, 2016 at 2:33 PM
Resolved October 5, 2016 at 10:47 AM