The logic to find event definitions confuses the Event Translator when translating SNMP Traps

Description

Eventd, the daemon responsible for handling events, has received several changes to improve its performance, especially when choosing the correct event definition to associate with an incoming event.

The Event Translator, in turn, can be used to generate other events based on the information in incoming events. It builds each translated event from a cloned version of the incoming event, overriding some fields according to the translation specification.

Because the cloned events preserve the SNMP data, Eventd does not pick the correct definition for them.

A user would expect Eventd to use the UEI configured in translator-configuration.xml to pick the correct event definition for the translated event. Because the translated event contains the SNMP data associated with the incoming trap, Eventd uses that data instead to match the event definition. The unwanted result is an exact copy of the original event with a different UEI, plus any additional changes that depend on how the translation has been configured.

Let's say you have an SNMP Trap that generates an alarm, and you also want to use that trap to set the status of a service using the Passive Status Keeper. The Event Translator seems to be the perfect solution for this. The problem is that the translated event will have the UEI of passiveServiceStatus and the expected parameters, but everything else will come from the definition of the original trap, including the alarm-data. That is not expected, and it has the undesired effect of creating duplicate alarms (with different UEIs) that have to be cleared manually.

For this reason, I think that we should remove the SNMP object from the cloned event to guarantee that Eventd will use the UEI to find the correct event definition for the translated event.
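The effect of the proposed change can be sketched with a small, self-contained model. This is not the Eventd or Event Translator code: the Event and Definition classes, the "SNMP mask first, UEI otherwise" matching rule, the trap UEI and the enterprise OID are all simplifying assumptions made for illustration. Only the passiveServiceStatus UEI and the general behavior (a definition matched through SNMP data wins over the translated UEI) come from the description above.

```java
import java.util.Arrays;
import java.util.List;

public class TranslatedEventSketch {

    /** Hypothetical, simplified event: a UEI plus an optional SNMP trap identity. */
    static final class Event {
        final String uei;
        final String snmpTrapId; // null when the event carries no SNMP data

        Event(String uei, String snmpTrapId) {
            this.uei = uei;
            this.snmpTrapId = snmpTrapId;
        }

        /** Clone the event under a new UEI, keeping or dropping the SNMP data. */
        Event translate(String newUei, boolean preserveSnmpData) {
            return new Event(newUei, preserveSnmpData ? snmpTrapId : null);
        }
    }

    /** Hypothetical event definition: matched by SNMP mask if it has one, else by UEI. */
    static final class Definition {
        final String uei;
        final String snmpMask; // null means "match on the UEI only"
        final boolean hasAlarmData;

        Definition(String uei, String snmpMask, boolean hasAlarmData) {
            this.uei = uei;
            this.snmpMask = snmpMask;
            this.hasAlarmData = hasAlarmData;
        }

        boolean matches(Event e) {
            return snmpMask != null ? snmpMask.equals(e.snmpTrapId) : uei.equals(e.uei);
        }
    }

    // Assumed names: the trap UEI and the trap identity are made up for this sketch.
    static final String TRAP_UEI = "uei.example.org/traps/linkFailure";
    static final String PASSIVE_UEI = "uei.opennms.org/services/passiveServiceStatus";
    static final String TRAP_ID = ".1.3.6.1.4.1.99999/6/1";

    static final List<Definition> DEFINITIONS = Arrays.asList(
            new Definition(TRAP_UEI, TRAP_ID, true),    // trap definition with alarm-data
            new Definition(PASSIVE_UEI, null, false));  // passive status definition

    /** Return the first definition that matches the event, or null if none does. */
    static Definition findDefinition(List<Definition> definitions, Event e) {
        for (Definition d : definitions) {
            if (d.matches(e)) {
                return d;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Event trap = new Event(TRAP_UEI, TRAP_ID);
        Event kept = trap.translate(PASSIVE_UEI, true);      // clone keeps the SNMP data
        Event stripped = trap.translate(PASSIVE_UEI, false); // clone drops the SNMP data

        // The clone that keeps SNMP data is matched to the trap definition (with alarm-data).
        System.out.println(findDefinition(DEFINITIONS, kept).uei);
        // The clone without SNMP data is matched through its UEI.
        System.out.println(findDefinition(DEFINITIONS, stripped).uei);
    }
}
```

In this model, the clone that keeps the SNMP data resolves to the trap definition and inherits its alarm-data, while the clone without SNMP data resolves to the passiveServiceStatus definition through its UEI, which is the behavior the proposal aims for.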

If there are corner cases in which a user needs to preserve the SNMP data on the cloned event for some reason, we can add an optional attribute to the mapping definition inside the event-translation-spec to keep the SNMP data, for example:

Acceptance / Success Criteria

None

Attachments

2

Activity

Alejandro Galue April 26, 2016 at 4:34 PM

Alejandro Galue April 26, 2016 at 2:07 PM

I always use Git and PRs. The patch was for reference purposes only; that's how I work. You can ignore patches on any issue I create and just pay attention to Git.

Alejandro Galue April 26, 2016 at 2:02 PM
Edited

Alternative for people using older versions of OpenNMS:

For users who are experiencing this problem and cannot upgrade OpenNMS, the following procedure explains how to use Scriptd to replace the logic performed by the Event Translator, in order to send a passiveServiceStatus event:

scriptd-configuration.xml for Horizon 17, Meridian 2016 or newer

passive-status-handler.bsh for Horizon 17, Meridian 2016 or newer

I omitted the sample for the "Up" event for simplicity, but it should be similar. You just need to know which event triggers the "Up" status, and then call:

Ronny Trommer April 26, 2016 at 1:32 PM

Suggestion: a branch and PR instead of a patch file? Patch files are hard to track and maintain, and it is unclear whether the patch needs to be applied to develop, foundation-2016, or release-18.0.0. We should use the benefits of Git and GitHub.

Alejandro Galue April 26, 2016 at 12:24 PM

If I apply the attached patch and follow the steps to reproduce the problem, I can see that the passiveServiceStatus is now properly generated:

The solution was tested on a branch created from foundation-2016 specifically for this issue.

Fixed

Created April 26, 2016 at 11:18 AM
Updated June 28, 2016 at 4:18 PM
Resolved April 27, 2016 at 1:47 PM