Stale IP Address Cache

Description

If two nodes share an IP address and one of them is deleted (for example, because you're in the middle of migrating to a new provisioning requisition setup), new SNMP trap events are no longer associated with the remaining node until OpenNMS is restarted.

@jesse suggests this may be due to issues with a cache that is only rebuilt on restart:

hmm, yeah I see how it could happen, it only stores a single entry for (location, ip address), so if there are duplicates, and then one is removed, the cache won't have any entries for that address until it is rebuilt, which only happens at startup
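
The failure mode described above can be reproduced with a small model. This is only an illustrative sketch, not the OpenNMS code: the class name, method names, and the assumption that removal drops the entry by key are all hypothetical.

```python
# Illustrative model only; names are hypothetical and not taken from OpenNMS.
class SingleEntryCache:
    """Maps (location, ip) to exactly one node id."""

    def __init__(self, interfaces):
        self._entries = {}
        # The cache is filled only here, mimicking a rebuild at startup.
        for location, ip, node_id in interfaces:
            # A later duplicate silently overwrites the earlier node.
            self._entries[(location, ip)] = node_id

    def remove_interface(self, location, ip):
        # Assumption: deleting an interface drops the single entry for its key.
        self._entries.pop((location, ip), None)

    def lookup(self, location, ip):
        return self._entries.get((location, ip))


# Two nodes (1 and 2) share the same address in the same location.
cache = SingleEntryCache([("Default", "10.0.0.1", 1), ("Default", "10.0.0.1", 2)])

# One of the two interfaces is deleted.
cache.remove_interface("Default", "10.0.0.1")

# A node with this address still exists, but the cache no longer knows about it.
result = cache.lookup("Default", "10.0.0.1")
```

In this model `result` stays `None` until a new `SingleEntryCache` is constructed, which corresponds to restarting the service.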

Acceptance / Success Criteria

None

Activity

Jesse White October 11, 2017 at 6:05 PM

PR: https://github.com/OpenNMS/opennms/pull/1699

fooker September 21, 2017 at 11:51 AM

Yeah, working on it. Looks like multi-node support is doable.

Benjamin Reed September 19, 2017 at 2:10 PM

@fooker, are you taking this? Otherwise I was gonna work on it today...

Benjamin Reed September 19, 2017 at 2:00 PM

Making it multi-node aware is probably a bit nasty; seems like doing a lookup on cache miss (and maybe caching negative values for a bit) would be the easiest way.
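
A rough sketch of the lookup-on-miss idea with short-lived negative entries might look like the following. Everything here is an assumption made for illustration: `db_lookup` stands in for whatever database query resolves an address to a node, and the class name, the `negative_ttl` parameter, and the 30 second default are hypothetical.

```python
import time


class LookupOnMissCache:
    """Falls back to a database lookup on cache miss (hypothetical sketch)."""

    def __init__(self, db_lookup, negative_ttl=30.0, clock=time.monotonic):
        self._db_lookup = db_lookup        # callable(location, ip) -> node id or None
        self._negative_ttl = negative_ttl  # seconds to remember "not found"
        self._clock = clock                # injectable so expiry can be tested
        self._hits = {}                    # (location, ip) -> node id
        self._misses = {}                  # (location, ip) -> expiry timestamp

    def lookup(self, location, ip):
        key = (location, ip)
        if key in self._hits:
            return self._hits[key]

        now = self._clock()
        expiry = self._misses.get(key)
        if expiry is not None and now < expiry:
            # Recently confirmed missing; skip the database for now.
            return None

        node_id = self._db_lookup(location, ip)
        if node_id is None:
            self._misses[key] = now + self._negative_ttl
        else:
            self._misses.pop(key, None)
            self._hits[key] = node_id
        return node_id

    def remove(self, location, ip):
        # Dropping the entry is safe: the next lookup asks the database again.
        self._hits.pop((location, ip), None)
```

With this shape, deleting one of two duplicate interfaces only costs one extra database query: the next trap for that address misses the cache, and the lookup finds the remaining node. The negative entries keep traps from unknown addresses from hitting the database on every event.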

Jesse White September 13, 2017 at 3:14 PM

A few ways we can fix this:

Periodically re-synchronize the cache with the database contents
Make the cache aware that multiple nodes can have the same IP address, and remove the correct one when an interface is deleted
Perform a database lookup on cache miss
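
As an illustration of the second option in the list above, a multi-node-aware cache could keep a set of node ids per (location, IP address) key. This is a minimal sketch under stated assumptions, not the change that was eventually merged: the class and method names are hypothetical, and returning the lowest node id when several nodes match is an arbitrary tie-break chosen for the example.

```python
from collections import defaultdict


class MultiNodeCache:
    """Tracks every node that owns a given (location, ip); hypothetical sketch."""

    def __init__(self):
        self._nodes = defaultdict(set)  # (location, ip) -> set of node ids

    def add_interface(self, location, ip, node_id):
        self._nodes[(location, ip)].add(node_id)

    def remove_interface(self, location, ip, node_id):
        key = (location, ip)
        nodes = self._nodes.get(key)
        if nodes is None:
            return
        # Remove only the deleted node; other owners of the address remain.
        nodes.discard(node_id)
        if not nodes:
            del self._nodes[key]

    def lookup(self, location, ip):
        nodes = self._nodes.get((location, ip))
        # Assumption: pick the lowest node id when several nodes match.
        return min(nodes) if nodes else None
```

For example, after adding nodes 1 and 2 for the same address and then removing node 2, `lookup` still returns 1, so no restart is needed to keep resolving the address.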

Fixed

Details
Assignee: fooker
Reporter: Mike Kelly
Sprint: None
Fix versions: 21.0.0, Meridian-2017.1.1
Affects versions: 20.0.2
Priority: Minor

Created September 6, 2017 at 5:32 PM

Updated October 11, 2017 at 6:05 PM

Resolved October 11, 2017 at 5:38 PM
