How to compare event count value with previous time interval event

Submitted by 蹲街弑〆低调 on 2019-12-12 18:12:02

Question


I would like to compare the total event count for the current one-hour interval with the total event count for the previous one-hour interval. If the current hour's count is lower than the previous hour's count, Riemann should send an email.

I am not sure whether the previous value can be stored and compared with the current one, because I understand that events expire according to their TTL in Riemann.

Please correct me if I am wrong, and suggest some reference code to achieve this in Riemann.

Thanks in advance


Answer 1:


It sounds like you want the rate of change of the count over an hour, and then to decide whether that rate is negative. One way to do this is just as you describe:

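;; Assumes `email` is a stream bound elsewhere (for example via mailer).
(fixed-time-window 3600                    ; collect events into one-hour windows
  (smap folds/count                        ; one event per hour; :metric = event count
    (moving-event-window 2                 ; sliding pair: [previous-hour current-hour]
      (smap folds/difference               ; :metric = previous count - current count
        (where (and metric (pos? metric))  ; positive means the count dropped
          email)))))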

This makes sense. You may find, however, that if you use the built-in derivative-over-time function ddt and graph the result, you can spot these problems over much shorter timescales. If your success rate falls to zero at minute three of an hour, 57 minutes is a long time for the computer to wait before it calls a human for help. If the rate of change over a 15-minute period approaches negative infinity, it's very likely that your service just stopped.
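
The first snippet uses `email` as a ready-made stream without showing where it comes from. A minimal sketch of one way to bind it in riemann.config, assuming Riemann's `mailer` helper and a local mail relay; both addresses are hypothetical placeholders:

```clojure
;; Sketch only: both addresses are hypothetical placeholders.
;; (mailer opts) returns a function that takes recipient addresses and
;; returns a stream, so `email` can then be used directly as a child stream.
(def email
  ((mailer {:from "riemann@example.com"})
   "oncall@example.com"))
```

If different alerts should go to different people, it may be simpler to bind only `(mailer {...})` and call it with the recipient at each use site.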

I'm fond of wrapping ddt in the exponentially weighted moving average ewma so that spikes don't set off the alarms, and I have had an extremely low false-positive rate with this pattern:

(ewma 30 (ddt ...your stuff here...))
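
As a rough illustration of how that pattern might be wired into a complete stream, here is a sketch under several assumptions: the service name is borrowed from the example further down, the alert threshold is an arbitrary hypothetical value that would need tuning, and `email` is assumed to be a stream bound elsewhere.

```clojure
(streams
  (where (service "service:output")     ; hypothetical service name
    (with :metric 1                     ; count every event as 1
      (rate 60                          ; events per second, emitted every 60 s
        (ewma 30                        ; same smoothing setting as the pattern above
          (ddt                          ; rate of change between successive events
            (where (and metric (< metric -0.01)) ; hypothetical alert threshold
              email)))))))
```

As in the pattern above, the smoothing happens before the differentiation, because the outer stream receives each event first and passes its output to its children.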

I often want to compare the rate of requests to a service with the rate of responses, using this pattern, which combines ewma, ddt, and project:

 (pipe ↲ (splitp = service
               "service:input" (ewma 30 ↲)
               "service:output" (ewma 30 ↲)
               bit-bucket) ;; throw out other services here
     (project [(service "service:input")
               (service "service:output")]
              (smap folds/quotient-sloppy
                    (with :service "service-ratio-rate-of-change"
                          (ddt ...your streams here...)))))

If requests are infrequent, you will need to play with the interval in all these examples to ensure that the alarms don't go off between events. If your events are infrequent, you may also need to set the :ttl on the events high enough that they don't expire while you are aggregating them.
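
One possible way to raise the TTL inside Riemann itself, rather than in the clients, is the `default` stream. This is only a sketch: the two-hour value is a hypothetical choice for hourly aggregation, and `prn` stands in for whatever child streams you actually use.

```clojure
(streams
  (default :ttl 7200                    ; hypothetical: two hours, used only when an event has no :ttl
    (fixed-time-window 3600
      (smap folds/count
        prn))))                         ; placeholder child: print the hourly count event
```

`default` leaves an existing :ttl alone; `with` can be used instead if the value sent by the client should be overridden.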

PS: the ↲ can be any symbol(s) you want; I just chose that Unicode character.
PPS: a false-positive rate of one alarm per quarter should be reasonable if you consider these things carefully.



Source: https://stackoverflow.com/questions/39019835/how-to-compare-event-count-value-with-previous-time-interval-event
