Details
- Bug
- Status: Resolved
- Minor
- Resolution: Fixed
- 28.0.2
- None
- Security Level: Default (Default Security Scheme)
- 28.0.2 + 27b561bf527bcff69155551503d5058c238fc52 cherry-picked
- 5
- Horizon 2021 - Sep 1 - 15, Horizon 2021 - Oct 13 - 27
- Backlog
- 741
Description
Some of our Minions process flows fine for a few hours or days, then suddenly stop.
When I tail their logs, there are many messages saying "Invalid packet: null".
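To put a number on "many messages", a small script along these lines could count the matching log lines per hour. This is only a sketch: it assumes each log line starts with an ISO-8601 timestamp (for example `2021-08-23T09:15:02,123`), and the log path passed on the command line is whatever file is being tailed.

```python
import sys
from collections import Counter

MARKER = "Invalid packet: null"


def count_invalid_packets(lines, marker=MARKER):
    """Count lines containing the marker, bucketed by hour."""
    counts = Counter()
    for line in lines:
        if marker in line:
            # Assumption: the first whitespace-separated token is an
            # ISO-8601 timestamp; its first 13 characters are YYYY-MM-DDTHH.
            stamp = line.split(maxsplit=1)[0]
            counts[stamp[:13]] += 1
    return counts


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (path is an example): python count_invalid.py /path/to/karaf.log
    with open(sys.argv[1], encoding="utf-8", errors="replace") as fh:
        for hour, n in sorted(count_invalid_packets(fh).items()):
            print(hour, n)
```

A sudden jump in the hourly count would mark roughly when a given Minion stopped processing flows.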
Given this bundle list:
354 | Active | 80 | 28.0.2 | OpenNMS :: Features :: Telemetry :: API
355 | Active | 80 | 28.0.2 | OpenNMS :: Features :: Telemetry :: Common
356 | Active | 80 | 28.0.2 | OpenNMS :: Features :: Telemetry :: Config :: API
357 | Active | 80 | 28.0.2 | OpenNMS :: Features :: Telemetry :: Config :: JAXB
358 | Active | 80 | 28.0.2 | OpenNMS :: Features :: Telemetry :: Distributed :: Common
359 | Active | 80 | 28.0.2 | OpenNMS :: Features :: Telemetry :: Distributed :: Minion
360 | Active | 80 | 28.0.2 | OpenNMS :: Features :: Telemetry :: Listeners
361 | Active | 80 | 28.0.2 | OpenNMS :: Features :: Telemetry :: Protocols :: BMP :: Parser
362 | Active | 80 | 28.0.2 | OpenNMS :: Features :: Telemetry :: Protocols :: BMP :: Transport
363 | Active | 80 | 28.0.2 | OpenNMS :: Features :: Telemetry :: Protocols :: Common
364 | Active | 80 | 28.0.2 | OpenNMS :: Features :: Telemetry :: Protocols :: Netflow :: Parser
365 | Active | 80 | 28.0.2 | OpenNMS :: Features :: Telemetry :: Protocols :: Netflow :: Transport
366 | Active | 80 | 28.0.2 | OpenNMS :: Features :: Telemetry :: Protocols :: SFlow :: Parser
367 | Active | 80 | 28.0.2 | OpenNMS :: Features :: Telemetry :: Registry
368 | Active | 80 | 28.0.2 | OpenNMS :: Features :: Telemetry :: Shell
Restarting bundles 358, 360, 364, and 365 has no effect on this issue.
Restarting bundle 359 (OpenNMS :: Features :: Telemetry :: Distributed :: Minion) does allow flow processing to resume.
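Bundle IDs can differ between Minions, so the workaround is easier to script by bundle name than by number. The following sketch looks up the ID from a saved bundle listing and builds the matching Karaf `bundle:restart` command. The helper names are hypothetical, and the parser assumes five columns (ID, state, start level, version, name) separated by any single non-space character.

```python
import re

# ID, state, start level, version, name. Any single non-space character
# surrounded by whitespace is accepted as the column separator.
ROW = re.compile(
    r"^\s*(\d+)\s+\S\s+(\w+)\s+\S\s+(\d+)\s+\S\s+(\S+)\s+\S\s+(.+?)\s*$"
)

# Excerpt of the listing from this ticket, used as sample input.
LISTING = """\
358 | Active | 80 | 28.0.2 | OpenNMS :: Features :: Telemetry :: Distributed :: Common
359 | Active | 80 | 28.0.2 | OpenNMS :: Features :: Telemetry :: Distributed :: Minion
360 | Active | 80 | 28.0.2 | OpenNMS :: Features :: Telemetry :: Listeners
"""


def find_bundle_ids(listing, name_suffix):
    """Return the IDs of bundles whose name ends with name_suffix."""
    ids = []
    for line in listing.splitlines():
        match = ROW.match(line)
        if match and match.group(5).endswith(name_suffix):
            ids.append(int(match.group(1)))
    return ids


def restart_commands(listing, name_suffix):
    """Build one Karaf shell command per matching bundle."""
    return [f"bundle:restart {i}" for i in find_bundle_ids(listing, name_suffix)]


if __name__ == "__main__":
    for command in restart_commands(LISTING, "Distributed :: Minion"):
        print(command)
```

The printed commands are meant to be pasted into (or piped to) the Minion's Karaf shell; the script itself does not connect to anything.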
All of the Minions were restarted on Friday (8/20).
This morning (8/23), 168 of them were in this state.
In addition, this code leads me to believe that at DEBUG log level, the entire exception should be written to the log, but this does not appear to be the case.
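For reference, the logging behavior expected here (a short message at the normal level, plus the full exception only when DEBUG is enabled) looks roughly like the sketch below. It is a Python analogue written for illustration, not the OpenNMS Java code; the function names and the "truncated header" error are made up for the demo.

```python
import io
import logging


def handle_packet(log, parse, packet):
    """Parse a packet; on failure log a short warning and, at DEBUG, the traceback."""
    try:
        return parse(packet)
    except ValueError as exc:
        log.warning("Invalid packet: %s", exc)
        # exc_info=True attaches the full traceback to this DEBUG record,
        # so it only appears when the logger is set to DEBUG.
        log.debug("Invalid packet details", exc_info=True)
        return None


def run_demo(level):
    """Run one failing parse at the given log level and return the captured log text."""
    stream = io.StringIO()
    log = logging.getLogger(f"demo.{level}")
    log.handlers.clear()
    log.propagate = False
    log.setLevel(level)
    log.addHandler(logging.StreamHandler(stream))

    def failing_parse(_packet):
        # Hypothetical parser failure used only for this demo.
        raise ValueError("truncated header")

    handle_packet(log, failing_parse, b"\x00")
    return stream.getvalue()


if __name__ == "__main__":
    print(run_demo(logging.WARNING))  # short message only
    print(run_demo(logging.DEBUG))    # short message plus traceback
```

If the Minion behaved this way, switching to DEBUG should add a stack trace next to each "Invalid packet" line, which is what the attached debug logs do not show.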
Attached are debug logs, packet captures, thread dumps.