Evaluate flow-related Elasticsearch query performance

Description

The REST API endpoints used by the Flow Datasource in Helm issue many queries to Elasticsearch. We have identified cases where the REST API takes a long time to respond (30+ minutes), but we need to isolate the cause further.

Acceptance / Success Criteria

None

Activity

Jesse White March 9, 2020 at 7:20 PM

Based on a one-hour period from a production environment, we have the following metrics:

  • Number of unique exporters: 800

  • Number of unique source addresses: 120069

  • Number of unique destination addresses: 131268

  • Number of unique applications: 5901

  • Number of flows per hour for a particular interface: 6400794

Time taken to determine the Top N applications for a specific application over a given period:

  • 1 hour - 534 ms

  • 2 hours - 1100 ms

  • 4 hours - 2441 ms

  • 24 hours - 11236 ms

To determine the Top N applications, the query combines a filter, a group-by aggregation, and a sum. Our plugin is not used for this query.
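As a rough illustration of the query shape described above (filter, group-by, sum), here is a minimal sketch that builds an Elasticsearch search body as a plain Python dictionary. The field names (`@timestamp`, `netflow.application`, `netflow.bytes`), the aggregation names, and the helper `top_n_applications_query` are assumptions made for illustration; the actual query issued by the REST API may differ.

```python
import json


def top_n_applications_query(start_ms, end_ms, n=10,
                             time_field="@timestamp",           # assumed field name
                             app_field="netflow.application",   # assumed field name
                             bytes_field="netflow.bytes"):      # assumed field name
    """Build a search body: time-range filter, group by application, sum bytes."""
    return {
        "size": 0,  # only aggregation results are needed, not individual flow documents
        "query": {
            "bool": {
                "filter": [
                    {"range": {time_field: {"gte": start_ms,
                                            "lte": end_ms,
                                            "format": "epoch_millis"}}}
                ]
            }
        },
        "aggs": {
            "by_application": {
                # group-by: one bucket per application, keeping the top n by total bytes
                "terms": {"field": app_field,
                          "size": n,
                          "order": {"total_bytes": "desc"}},
                # sum: total bytes within each application bucket
                "aggs": {"total_bytes": {"sum": {"field": bytes_field}}},
            }
        },
    }


# Example: a one-hour window ending at an arbitrary timestamp (epoch milliseconds).
end_ms = 1583780400000
body = top_n_applications_query(end_ms - 3600 * 1000, end_ms, n=10)
print(json.dumps(body, indent=2))
```

Because the filter runs in filter context and the aggregation is a standard terms/sum pair, the cost should mostly track the number of flow documents in the time window, which matches the roughly proportional timings listed above.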

Extrapolating the 24-hour Top N timing linearly to 30 days (11236 ms × 30 ≈ 337 s), this query would take at least 5 minutes.
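The extrapolation can be checked with a short calculation. This sketch assumes the query time scales linearly with the length of the time window, which is an assumption rather than something measured; it scales each of the measured timings above to 30 days to give a range of estimates.

```python
# Measured Top N timings from the comment above: window in hours -> milliseconds.
timings_ms = {1: 534, 2: 1100, 4: 2441, 24: 11236}

target_hours = 30 * 24  # 30 days

# Assumption: query time grows linearly with the window length.
estimates_min = {
    hours: ms / hours * target_hours / 60000  # ms per hour, scaled to 30 days, in minutes
    for hours, ms in timings_ms.items()
}

lowest = min(estimates_min.values())
highest = max(estimates_min.values())

for hours, minutes in estimates_min.items():
    print(f"based on the {hours} h sample: {minutes:.1f} min")
print(f"range: {lowest:.1f} to {highest:.1f} min")
```

Under that assumption the estimates fall between roughly 5.6 and 7.3 minutes, consistent with the "at least 5 minutes" figure.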

Fixed

Created February 19, 2020 at 6:01 AM
Updated March 9, 2020 at 7:22 PM
Resolved March 9, 2020 at 7:22 PM