Question
I have several graphs created by RRDTool that collected bad data during a time period of a couple of hours.
How can I remove the data from the RRDs during that time period so that it no longer displays?
Answer 1:
The best method I found to do this:
- Use rrdtool dump to export the RRD files to XML.
- Open the XML file, then find and edit the bad data.
- Restore the RRD file using rrdtool restore.
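The three steps above can be sketched in Python roughly as follows. This is a minimal sketch, not a definitive tool: it assumes the dump has one row per line, each preceded by a `<!-- date / unixtime -->` comment, and the names `blank_rows_between` and `clean_rrd` are made up for illustration. It replaces every value in the bad window with NaN so that the graph shows a gap there.

```python
import re
import subprocess

# Matches the Unix timestamp in the comment that precedes a row in the dump,
# e.g. "<!-- 2014-10-17 20:04:00 BST / 1413572640 --> <row>..."
ROW_TIME = re.compile(r"/\s*(\d+)\s*-->\s*<row>")
VALUE = re.compile(r"<v>[^<]*</v>")


def blank_rows_between(xml_text, start, end):
    """Return xml_text with all <v> values set to NaN for rows whose
    timestamp t satisfies start <= t <= end (hypothetical helper)."""
    out = []
    for line in xml_text.splitlines(keepends=True):
        match = ROW_TIME.search(line)
        if match and start <= int(match.group(1)) <= end:
            head, sep, tail = line.partition("<row>")
            line = head + sep + VALUE.sub("<v>NaN</v>", tail)
        out.append(line)
    return "".join(out)


def clean_rrd(src, dst, start, end):
    """Dump src, blank the bad window, and restore the result into dst."""
    xml_text = subprocess.run(
        ["rrdtool", "dump", src],
        check=True, capture_output=True, text=True,
    ).stdout
    subprocess.run(
        ["rrdtool", "restore", "-", dst],
        input=blank_rows_between(xml_text, start, end),
        check=True, text=True,
    )
```

For example, `clean_rrd("foo.rrd", "foo_clean.rrd", 1413572640, 1413579840)` would write a cleaned copy and leave the original file untouched (the file names and timestamps here are placeholders).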
Answer 2:
I had a similar problem where I wanted to discard the most recent few hours from my RRDtool databases, so I wrote a quick script to do it (apologies for the unconventional variable names - coding style inherited from work, sigh):
    #!/usr/bin/env python3
    """
    Modify XML data generated by `rrdtool dump` such that the last update was at
    the unixtime specified (decimal). Data newer than this is simply omitted.

    Sample usage::

        rrdtool dump foo.rrd \
            | python3 remove_samples_newer_than.py 1414782122 \
            | rrdtool restore - foo_trimmed.rrd
    """
    import sys

    assert sys.argv[1:], "Must specify maximum Unix timestamp in decimal"
    iMaxUpdate = int(sys.argv[1])

    for rLine in sys.stdin:
        if "<lastupdate>" in rLine:
            # <lastupdate>1414782122</lastupdate> <!-- 2014-10-31 19:02:02 GMT -->
            _, _, rData = rLine.partition("<lastupdate>")
            rData, _, _ = rData.partition("</lastupdate")
            iLastUpdate = int(rData)
            # There is only something to trim if the RRD is newer than the cutoff.
            assert iLastUpdate > iMaxUpdate, "Last update in RRD older than " \
                "the time you provided, nothing to do"
            print("<lastupdate>{0}</lastupdate>".format(iMaxUpdate))
        elif "<row>" in rLine:
            # <!-- 2014-10-17 20:04:00 BST / 1413572640 --> <row><v>9.8244774011e+01</v><v>8.5748587571e-01</v><v>4.2046610169e+00</v><v>9.3016101695e+01</v><v>5.0000000000e-02</v><v>1.6652542373e-01</v><v>1.1757062147e+00</v><v>1.6901226735e+10</v><v>4.2023108608e+09</v><v>2.1457537707e+08</v><v>3.9597816832e+09</v><v>6.8812800000e+05</v><v>3.0433198080e+09</v><v>6.0198912250e+06</v><v>2.0000000000e+00</v><v>0.0000000000e+00</v></row>
            # Extract the Unix timestamp from the comment in front of <row>.
            rData, _, _ = rLine.partition("<row>")
            _, _, rData = rData.partition("/")
            rData, _, _ = rData.partition("--")
            rData = rData.strip()
            iUpdate = int(rData)
            # Keep only rows older than the cutoff; newer rows are dropped.
            if iUpdate < iMaxUpdate:
                print(rLine, end="")
        else:
            # Everything else (header, DS and RRA definitions) passes through.
            print(rLine, end="")
Worked for me. Hope it helps someone else.
Answer 3:
If you want to avoid writing and editing an XML file, which can take a fair number of file I/O calls depending on how much bad data you have, you can instead read the entire RRD into memory using fetch and update the values in memory.
I did a similar task using Python + rrdtool, and I ended up doing the following:
- read the RRD into memory as a dictionary
- fix the values in the dictionary
- delete the existing RRD file
- create a new RRD with the same name.
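The in-memory steps above might look roughly like this. It is only a sketch under assumptions: `fetch_result` is taken to have the shape the Python rrdtool bindings return from `rrdtool.fetch()`, that is `((start, end, step), ds_names, rows)`, with the first row belonging to `start + step`, and the helper names are invented here. Deleting and re-creating the file is left out because it depends on the DS and RRA definitions of your own RRD.

```python
def fetch_to_dict(fetch_result):
    """Turn ((start, end, step), ds_names, rows) into {timestamp: {ds: value}}.

    Assumption: the first row belongs to start + step, as in the output of
    the `rrdtool fetch` command line tool.
    """
    (start, _end, step), ds_names, rows = fetch_result
    return {
        start + step * (i + 1): dict(zip(ds_names, row))
        for i, row in enumerate(rows)
    }


def blank_window(data, bad_start, bad_end):
    """Set every value with bad_start <= timestamp <= bad_end to None."""
    for ts, values in data.items():
        if bad_start <= ts <= bad_end:
            for ds in values:
                values[ds] = None
    return data


def to_update_args(data, ds_names):
    """Build 'timestamp:v1:v2' strings for rrdtool update; 'U' means unknown."""
    args = []
    for ts in sorted(data):
        fields = ["U" if data[ts][ds] is None else repr(data[ts][ds])
                  for ds in ds_names]
        args.append(":".join([str(ts)] + fields))
    return args
```

Keep in mind that fetch returns consolidated values from a single archive, so a file rebuilt this way only holds what that archive held.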
Answer 4:
The only one who proposed exactly what to edit was RobM. I tried his solution, and it did not work for me in rrdtool 1.4.7.
My database uses AVERAGE, MAX and MIN. It contains DERIVE, GAUGE and COMPUTE data sources. Intervals (with row counts): second (70), minute (70), hour (25), day (367). My task: delete the most recent part of the data (typical reason: the clock was moved back).
I applied RobM's solution: change the last update time to my new end time and delete everything after it. The restored database seemed normal, but it did not accept new updates. I then examined a newly created empty database and found 70 per-second records containing NaN in it, and the same for minute and hour.
So, my working solution: if I delete records at the end of some period, I add the same number of NaN records at the beginning of that period, with correctly decreasing times. The exception is daily records, which are only deleted, with nothing added. If a period becomes empty after the deletions, I fill it with NaN records ending at my new end time (rounded to the period boundary).
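One reading of that padding rule, as a sketch rather than the answerer's actual code: an archive is modelled as a list of `(timestamp, values)` rows in ascending time order, `step` is the row interval in seconds, `trim_and_pad` is a made-up name, and "rounded to the period boundary" is assumed to mean rounding down to a multiple of `step`.

```python
import math


def trim_and_pad(rows, new_end, step, pad=True):
    """Drop rows newer than new_end, then restore the original row count.

    rows: list of (timestamp, [values]) in ascending time order.
    pad:  False mimics the exception for daily rows (delete only).
    """
    if not rows:
        return []
    width = len(rows[0][1])
    kept = [(ts, vals) for ts, vals in rows if ts <= new_end]
    removed = len(rows) - len(kept)
    if not pad or removed == 0:
        return kept
    if kept:
        # Continue the timestamps downwards from the oldest surviving row.
        first = kept[0][0]
    else:
        # Everything was removed: the padding ends at new_end, rounded down
        # to a multiple of step (assumed meaning of "period boundary").
        first = new_end - new_end % step + step
    padding = [(first - step * k, [math.nan] * width)
               for k in range(removed, 0, -1)]
    return padding + kept
```

The point of the padding is that each archive keeps the number of rows it was created with, which is what the restored file appeared to need before it would accept updates again.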
Source: https://stackoverflow.com/questions/10298484/remove-data-from-rrdtool