Having trouble getting accurate numbers from graphite

|▌冷眼眸甩不掉的悲伤 提交于 2019-12-03 00:33:20

A couple of things to check/try:

Configure Graphite for Statsd

Check to make sure that you've used the retention schema and aggregation settings in Graphite that match how Statsd will be sending data (i.e. it sends one data point per 10 second flush interval).

Run a single Statsd aggregator

Be sure you are only running one instance of Statsd as running multiple statsd daemons will cause metrics to be dropped (as Graphite will be configured to only store one data point for it's highest precision of 10s:6h)

Limit the time range in the UI or URL API to less than 6 hours

When displaying graphs with data that crosses over the 6 hour threshold (e.g. from now to 7 hours ago), you will begin seeing 1 minute worth of aggregated count data for the displayed graph (if you've configured Graphite for statsd with retentions = 10s:6h,1min:7d,10min:5y). Rollups will occur based on the oldest data point in the time range (e.g. now till 7+ days = you'll get 10 min rollups).

If sending sparse or "bursty" data AND displaying old time range (triggering aggregation)

Confirm that your xFilesFactor is low enough that aggregation produces non null values even with a high rate of nulls. For example, 100 requests in the first 10 seconds, and none for the remaining 50 seconds in a minute would cause a storage of 100, null, null, null, null, null which would be summed up to null when the data ages if the XFilesFactor is higher than 1/6. Using the statsd recommended graphite configuration handles this, but it is good to know about... as this can give the appearance of lost data.

Saving schema or aggregation changes

If you changed the graphite schema or aggregation settings after any metrics were stored (in whisper = graphite's storage) you'll need to either delete the .wsp files for the metric (graphite will recreate them) or run whisper-resize.py.

Validating settings

You can verify the settings against some whisper data by running whisper-info.py on a .wsp file. Find the .wsp file for one of your metrics in /graphite/storage/whisper/ Run: whisper-info.py my_metric_data.wsp. whisper-info.py output should tell you more about how the storage settings are working.

TLDR;

You should ensure that Graphite is set to store one data point per 10 second interval for metrics coming from StatsD. You should make sure that Graphite is summing (not averaging) for count data coming from Statsd. Both of these can be handled by using the recommended Statsd configuration settings. Don't run more than one Statsd aggregator. When using the UI, limit the data returned to less than 6 hours OR understand what rollup you are viewing when looking at data that crosses retention thresholds. Lastly, make sure the settings take (if you've already been sending metrics).

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!