Performance of time series integration layer

Description

Performance tests showed that the ts layer does not perform as well as Newts. It should show similar performance.

Test setup:
stress command:

Configuration of OpenNMS:

  • set debug level in log4j.xml to error

  • adjust the following settings in: etc/opennms.properties.d/ts.properties:

Acceptance / Success Criteria

None

Attachments

41


Activity


Patrick Schweizer May 18, 2022 at 6:25 PM

Overall conclusion: for the same number of events in the ring buffer, ts uses ~4x the memory of Newts.

=> closing this ticket

Freddy Chu May 17, 2022 at 9:17 PM

The last commit includes your fix.

I also ran the test once without:

event.setSamples(null);

 

The result looks similar to yours.

ring_buffer_size=131072

 

 

Freddy Chu May 17, 2022 at 2:07 AM

I used the same configuration as you, with ring buffer sizes 131072 and 262144.

My results are quite different from yours. One major difference is that my cloud plugin is connected to a real Cortex instance.

I used Java Flight Recorder for profiling, with OpenNMS running at Xmx 2g. Each test ran for only 5 minutes.

Karaf command used: stress-metrics -n 600 -i 20 -t 8

 

Cortex appears to consume even less memory than Newts, and the throughput is similar.

I have also attached the JFR files for more details.

 

cortex 131072

-- Meters ----------------------------------------------------------------------
numeric-attributes-generated
             count = 4050500
         mean rate = 14998.91 events/second
     1-minute rate = 14998.79 events/second
     5-minute rate = 14958.66 events/second
    15-minute rate = 14925.51 events/second
string-attributes-generated
             count = 810100
         mean rate = 2999.76 events/second
     1-minute rate = 2999.76 events/second
     5-minute rate = 2991.73 events/second
    15-minute rate = 2985.10 events/second

-- Timers ----------------------------------------------------------------------
batches
             count = 13
         mean rate = 0.05 calls/second
     1-minute rate = 0.05 calls/second
     5-minute rate = 0.03 calls/second
    15-minute rate = 0.01 calls/second
               min = 19978.07 milliseconds
               max = 20011.63 milliseconds
              mean = 19998.27 milliseconds
            stddev = 3.65 milliseconds
            median = 19999.75 milliseconds
              75% <= 20000.30 milliseconds
              95% <= 20002.27 milliseconds
              98% <= 20002.40 milliseconds
              99% <= 20011.63 milliseconds
            99.9% <= 20011.63 milliseconds
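
As a sanity check on the report above: a meter's mean rate is its count divided by the elapsed time, so the elapsed time at the moment of the report can be recovered from the two figures. The sketch below uses only numbers copied from the "cortex 131072" block; the function and variable names are illustrative and not part of OpenNMS.

```python
# Figures copied from the "cortex 131072" report.
numeric_count = 4_050_500
numeric_mean_rate = 14_998.91  # events/second
string_count = 810_100
string_mean_rate = 2_999.76    # events/second


def elapsed_seconds(count, mean_rate):
    """Elapsed time implied by a meter, since mean rate = count / elapsed."""
    return count / mean_rate


numeric_elapsed = elapsed_seconds(numeric_count, numeric_mean_rate)
string_elapsed = elapsed_seconds(string_count, string_mean_rate)
numeric_to_string_ratio = numeric_count / string_count

print(f"numeric meter implies {numeric_elapsed:.2f} s elapsed")
print(f"string meter implies  {string_elapsed:.2f} s elapsed")
print(f"numeric:string attribute ratio = {numeric_to_string_ratio:.1f}")
```

Both meters imply roughly 270 s of runtime when the report was printed, and the generator produced exactly five numeric attributes per string attribute.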

 

newts 131072

-- Meters ----------------------------------------------------------------------
numeric-attributes-generated
             count = 4050500
         mean rate = 14997.93 events/second
     1-minute rate = 14998.79 events/second
     5-minute rate = 14958.66 events/second
    15-minute rate = 14925.51 events/second
string-attributes-generated
             count = 810100
         mean rate = 2999.51 events/second
     1-minute rate = 2999.76 events/second
     5-minute rate = 2991.73 events/second
    15-minute rate = 2985.10 events/second

-- Timers ----------------------------------------------------------------------
batches
             count = 13
         mean rate = 0.05 calls/second
     1-minute rate = 0.05 calls/second
     5-minute rate = 0.03 calls/second
    15-minute rate = 0.01 calls/second
               min = 19971.69 milliseconds
               max = 20003.18 milliseconds
              mean = 19999.39 milliseconds
            stddev = 2.89 milliseconds
            median = 19999.50 milliseconds
              75% <= 19999.77 milliseconds
              95% <= 20003.18 milliseconds
              98% <= 20003.18 milliseconds
              99% <= 20003.18 milliseconds
            99.9% <= 20003.18 milliseconds

 

 

cortex 262144

 

-- Meters ----------------------------------------------------------------------
numeric-attributes-generated
             count = 4050150
         mean rate = 14997.42 events/second
     1-minute rate = 14998.83 events/second
     5-minute rate = 14958.63 events/second
    15-minute rate = 14925.48 events/second
string-attributes-generated
             count = 810050
         mean rate = 2999.53 events/second
     1-minute rate = 2999.76 events/second
     5-minute rate = 2991.73 events/second
    15-minute rate = 2985.10 events/second

-- Timers ----------------------------------------------------------------------
batches
             count = 13
         mean rate = 0.05 calls/second
     1-minute rate = 0.05 calls/second
     5-minute rate = 0.03 calls/second
    15-minute rate = 0.01 calls/second
               min = 19965.51 milliseconds
               max = 20039.18 milliseconds
              mean = 19998.13 milliseconds
            stddev = 25.01 milliseconds
            median = 20001.46 milliseconds
              75% <= 20002.18 milliseconds
              95% <= 20039.18 milliseconds
              98% <= 20039.18 milliseconds
              99% <= 20039.18 milliseconds
            99.9% <= 20039.18 milliseconds

 

newts 262144

-- Meters ----------------------------------------------------------------------
numeric-attributes-generated
             count = 4050500
         mean rate = 14998.59 events/second
     1-minute rate = 14998.79 events/second
     5-minute rate = 14958.66 events/second
    15-minute rate = 14925.51 events/second
string-attributes-generated
             count = 810100
         mean rate = 2999.70 events/second
     1-minute rate = 2999.76 events/second
     5-minute rate = 2991.73 events/second
    15-minute rate = 2985.10 events/second

-- Timers ----------------------------------------------------------------------
batches
             count = 13
         mean rate = 0.05 calls/second
     1-minute rate = 0.05 calls/second
     5-minute rate = 0.03 calls/second
    15-minute rate = 0.01 calls/second
               min = 19970.43 milliseconds
               max = 20003.29 milliseconds
              mean = 19999.57 milliseconds
            stddev = 2.95 milliseconds
            median = 19999.36 milliseconds
              75% <= 20000.45 milliseconds
              95% <= 20003.29 milliseconds
              98% <= 20003.29 milliseconds
              99% <= 20003.29 milliseconds
            99.9% <= 20003.29 milliseconds
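
To compare runs like the ones above without reading the reports by eye, the mean rates can be pulled out with a few lines of Python. This is a minimal sketch that assumes the console report layout pasted in this comment (a metric name on its own line, followed by indented "key = value" lines); the function names are illustrative, and the two excerpts are copied from the 131072 runs.

```python
import re


def parse_mean_rates(report):
    """Map each metric name to its 'mean rate' value from a console report."""
    rates = {}
    current = None
    for line in report.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        match = re.match(r"mean rate = ([\d.]+)", stripped)
        if match and current is not None:
            rates[current] = float(match.group(1))
        elif "=" not in stripped and not stripped.endswith("-"):
            # No '=' and not a section rule: treat the line as a metric name.
            current = stripped
    return rates


def relative_difference(a, b):
    """Absolute difference between a and b, relative to b."""
    return abs(a - b) / b


cortex_report = """
numeric-attributes-generated
             count = 4050500
         mean rate = 14998.91 events/second
string-attributes-generated
             count = 810100
         mean rate = 2999.76 events/second
"""

newts_report = """
numeric-attributes-generated
             count = 4050500
         mean rate = 14997.93 events/second
string-attributes-generated
             count = 810100
         mean rate = 2999.51 events/second
"""

cortex_rates = parse_mean_rates(cortex_report)
newts_rates = parse_mean_rates(newts_report)

for name, cortex_rate in cortex_rates.items():
    diff = relative_difference(cortex_rate, newts_rates[name])
    print(f"{name}: cortex={cortex_rate} newts={newts_rates[name]} diff={diff:.4%}")
```

For these excerpts the mean rates differ by well under 0.1%, which matches the observation that throughput is similar in both setups.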

 

 

 

 

 

Patrick Schweizer May 14, 2022 at 10:37 AM

To quantify the difference in heap usage between Newts and ts, we ran another test with the following ring buffer sizes:

ring_buffer_size=16

ring_buffer_size=32768

ring_buffer_size=65536

ring_buffer_size=131072

ring_buffer_size=262144

ring_buffer_size=524288

ring_buffer_size=1048576

=> TS crashes at a ring buffer size with 4x fewer entries than Newts
=> the ring buffer memory footprint of TS is ~4x that of Newts
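
The two observations are consistent with each other: if each TS entry takes ~4x the memory of a Newts entry, a fixed heap budget fits a ring buffer with 4x fewer entries. The sketch below illustrates that arithmetic only. The per-entry byte size and the heap budget are hypothetical values chosen for illustration, not measurements from this test; only the ~4x ratio and the list of sizes come from this comment.

```python
# Ring buffer sizes tested above.
RING_BUFFER_SIZES = [16, 32768, 65536, 131072, 262144, 524288, 1048576]

FOOTPRINT_RATIO_TS_TO_NEWTS = 4  # "~4x", from the observation above
BYTES_PER_NEWTS_ENTRY = 1_000    # hypothetical, for illustration only
BYTES_PER_TS_ENTRY = BYTES_PER_NEWTS_ENTRY * FOOTPRINT_RATIO_TS_TO_NEWTS


def ring_buffer_bytes(entries, bytes_per_entry):
    """Heap retained by a fully populated ring buffer."""
    return entries * bytes_per_entry


def largest_size_within(budget_bytes, bytes_per_entry):
    """Largest tested ring buffer size whose footprint fits the budget."""
    fitting = [
        size
        for size in RING_BUFFER_SIZES
        if ring_buffer_bytes(size, bytes_per_entry) <= budget_bytes
    ]
    return max(fitting) if fitting else None


# Hypothetical budget: exactly enough for Newts at 524288 entries.
budget = ring_buffer_bytes(524288, BYTES_PER_NEWTS_ENTRY)

newts_max = largest_size_within(budget, BYTES_PER_NEWTS_ENTRY)
ts_max = largest_size_within(budget, BYTES_PER_TS_ENTRY)

print(f"Newts fits up to {newts_max} entries, TS up to {ts_max} entries")
print(f"ratio = {newts_max // ts_max}x")
```

Because every tested size above 16 is a power of two, a 4x per-entry footprint moves the limit down by exactly two steps in the list.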

Patrick Schweizer May 10, 2022 at 12:06 AM

I tried to assess the heap usage of Newts versus the time series integration layer (ts). I ran two load scenarios, one with Newts and one with ts. The goal was to get a sense of how much more heap ts uses for the same throughput.

I ran the following scenario:

  • stress command with 15k/s: stress-metrics -n 600 -i 20 -t 8

  • ring buffer size was default: ~8k

  • I rolled back the code change that sets the samples to null => all events in the ring buffer should be fully populated after a short amount of time
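
The reasoning behind the last point can be sketched as follows. A ring buffer preallocates a fixed number of event slots and recycles them, so the number of event objects stays equal to the buffer size. Without clearing, every slot keeps its last payload alive once the buffer has wrapped around. This is a simplified Python model, not the OpenNMS code: the class names are stand-ins, and the sizes are arbitrary.

```python
class SampleBatchEvent:
    """Stand-in for a recycled ring buffer slot (not the OpenNMS class)."""

    def __init__(self):
        self.samples = None


class RingBuffer:
    """Fixed set of preallocated slots that are reused in sequence."""

    def __init__(self, size):
        self.slots = [SampleBatchEvent() for _ in range(size)]
        self.next_sequence = 0

    def publish(self, samples):
        # Reuse the slot at this sequence instead of allocating a new event.
        slot = self.slots[self.next_sequence % len(self.slots)]
        slot.samples = samples
        self.next_sequence += 1
        return slot

    def retained_samples(self):
        # Samples still reachable through the slots.
        return sum(len(s.samples) for s in self.slots if s.samples is not None)


def run(clear_after_consume, size=8, batches=20, batch_len=100):
    """Publish batches; optionally null the payload after it is consumed."""
    ring = RingBuffer(size)
    for _ in range(batches):
        slot = ring.publish([0.0] * batch_len)
        if clear_after_consume:
            slot.samples = None  # analogous to event.setSamples(null)
    return len(ring.slots), ring.retained_samples()


slots_kept, samples_kept = run(clear_after_consume=False)
slots_cleared, samples_cleared = run(clear_after_consume=True)

print(f"without clearing: {slots_kept} slots retain {samples_kept} samples")
print(f"with clearing:    {slots_cleared} slots retain {samples_cleared} samples")
```

In both cases the slot count equals the ring buffer size; only the retained payload differs. This is why rolling back the change makes the heap dump show the worst-case footprint of a full buffer.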

After running for >10 min I took heap dumps and analyzed them.

Ring Buffer:


=> It looks like the TS memory footprint is ~10x

However, what doesn't add up for me is the number of SampleBatchEvents. It is ~8k for Newts but ~40k for TS. I would expect the same number, since they are recycled:


I tried to verify this number by looking at the ring buffer itself in the debugger:


But this looks fine.
We also need to be careful: the numbers don't really add up (I think due to multiple references):

Looking at the footprint of a SampleBatchEvent:


It looks like the footprint of a TS sample is roughly 4x that of a Newts sample.

Looking at the overall heap:

Done


Created May 3, 2022 at 10:03 PM
Updated June 7, 2022 at 2:21 PM
Resolved May 18, 2022 at 6:26 PM