Question
I'm working on a scientific computing code (written in C++), and in addition to performing unit tests for the smaller components, I'd like to do regression testing on some of the numerical output by comparing to a "known-good" answer from previous revisions. There are a few features I'd like:
- Allow comparing numbers to a specified tolerance, both for roundoff error and for looser expectations (a short sketch of what I mean appears after this list)
- Ability to distinguish between ints, doubles, etc., and to ignore text if necessary
- Well-formatted output telling what went wrong and where: in a multi-column table of data, show only the column entry that differs
- Return EXIT_SUCCESS or EXIT_FAILURE depending on whether the files match
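To pin down what I mean by the tolerance comparison, here is a purely illustrative sketch using Python's math.isclose; the numbers are made up, and a real tool would apply this kind of test to every value in the output files:

    from math import isclose

    # Relative tolerance covers roundoff-level drift; an absolute floor
    # handles values that are expected to be (near) zero.
    print(isclose(1.0000000001, 1.0, rel_tol=1e-9))            # True: within roundoff
    print(isclose(0.0, 1e-16, rel_tol=1e-9, abs_tol=1e-12))    # True: near-zero case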
Are there any good scripts or applications out there that do this, or will I have to roll my own in Python to read and compare output files? Surely I'm not the first person with this kind of requirement.
[The following is not strictly relevant, but it may factor into the decision of what to do. I use CMake and its embedded CTest functionality to drive unit tests that use the Google Test framework. I imagine that it shouldn't be hard to add a few add_custom_command statements in my CMakeLists.txt to call whatever regression software I need.]
Answer 1:
You should go for PyUnit, which is now part of the standard library under the name unittest. It supports everything you asked for. The tolerance check, for example, is done with assertAlmostEqual().
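A minimal sketch of that kind of check; the quantity and values here are made up for illustration, and in practice you would parse them from your output file:

    import unittest

    class RegressionTest(unittest.TestCase):
        def test_scalar_output(self):
            expected = 1.2345678   # known-good value from a previous revision
            actual = 1.2345679     # value parsed from the new output file
            # places=7 (the default) rounds the difference to 7 decimals;
            # delta= gives an explicit absolute tolerance instead.
            self.assertAlmostEqual(actual, expected, delta=1e-6)

    if __name__ == '__main__':
        unittest.main()

Running the module exits with a nonzero status when a test fails, which is exactly what a CTest-driven harness needs.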
Answer 2:
The ndiff utility may be close to what you're looking for: it's like diff, but it will compare text files of numbers to a desired tolerance.
Answer 3:
I ended up writing a Python script to do more or less what I wanted.
#!/usr/bin/env python
import re
import sys
from math import fabs
from optparse import OptionParser

splitPattern = re.compile(r',|\s+|;')


class FailObject(object):
    def __init__(self, options):
        self.options = options
        self.failure = False

    def fail(self, brief, full=""):
        print(">>>> ", brief)
        if self.options.verbose and full != "":
            print("     ", full)
        self.failure = True

    def exit(self):
        if self.failure:
            print("FAILURE")
            sys.exit(1)
        else:
            print("SUCCESS")
            sys.exit(0)


def numSplit(line):
    # split on commas, semicolons, or whitespace and parse each field as a float
    fields = splitPattern.split(line)
    if fields[-1] == "":
        del fields[-1]
    numList = [float(a) for a in fields]
    return numList


def softEquiv(ref, target, tolerance):
    if fabs(target - ref) <= fabs(ref) * tolerance:
        return True
    # if the reference number is zero, allow an absolute tolerance
    if ref == 0.0:
        return fabs(target) <= tolerance
    # reference is non-zero and it failed the relative test
    return False


def compareStrings(f, options, expLine, actLine, lineNum):
    ### check that they're a bunch of numbers
    try:
        exp = numSplit(expLine)
        act = numSplit(actLine)
    except ValueError:
        # this line is made of text, not numbers
        if expLine != actLine and options.checkText:
            f.fail("Text did not match in line %d" % lineNum)
        return

    ### check the ranges
    if len(exp) != len(act):
        f.fail("Wrong number of columns in line %d" % lineNum)
        return

    ### soft equiv on each value
    for col in range(len(exp)):
        expVal = exp[col]
        actVal = act[col]
        if not softEquiv(expVal, actVal, options.tol):
            f.fail("Non-equivalence in line %d, column %d"
                   % (lineNum, col))
    return


def run(expectedFileName, actualFileName, options):
    # message reporter
    f = FailObject(options)

    expected = open(expectedFileName)
    actual = open(actualFileName)
    lineNum = 0

    while True:
        lineNum += 1
        expLine = expected.readline().rstrip()
        actLine = actual.readline().rstrip()

        ## check that the files haven't ended,
        #  or that they ended at the same time
        if expLine == "":
            if actLine != "":
                f.fail("Tested file ended too late.")
            break
        if actLine == "":
            f.fail("Tested file ended too early.")
            break

        compareStrings(f, options, expLine, actLine, lineNum)
        # print("%3d: %s|%s" % (lineNum, expLine[0:10], actLine[0:10]))

    f.exit()


################################################################################
if __name__ == '__main__':
    parser = OptionParser(usage="%prog [options] ExpectedFile NewFile")
    parser.add_option("-q", "--quiet",
                      action="store_false", dest="verbose", default=True,
                      help="Don't print status messages to stdout")
    parser.add_option("--check-text",
                      action="store_true", dest="checkText", default=False,
                      help="Verify that lines of text match exactly")
    parser.add_option("-t", "--tolerance",
                      action="store", type="float", dest="tol", default=1.e-15,
                      help="Relative error when comparing doubles")

    (options, args) = parser.parse_args()

    if len(args) != 2:
        print("Usage: numdiff.py EXPECTED ACTUAL")
        sys.exit(1)

    run(args[0], args[1], options)
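For what it's worth, the script is invoked as, for example, ./numdiff.py --tolerance 1e-10 expected.dat actual.dat (the file names are placeholders), and because it exits nonzero on any mismatch it can be hooked straight into CTest as a test command.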
Answer 4:
I know I'm quite late to the party, but a few months ago I wrote the nrtest utility in an attempt to make this workflow easier. It sounds like it might help you too.
Here's a quick overview. Each test is defined by its input files and its expected output files. Following execution, output files are stored in a portable benchmark directory. A second step then compares this benchmark to a reference benchmark. A recent update has enabled user extensions, so you can define comparison functions for your custom data.
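To give a sense of the idea, a user-defined comparison function for numeric table output might look like the sketch below. This is purely illustrative: the function name, signature, and use of NumPy are my own, not nrtest's actual extension API, which is documented with the tool itself.

    import numpy as np

    def table_compare(path_test, path_ref, rtol=1e-9, atol=0.0):
        # parse two whitespace-delimited numeric files and check them
        # element-wise against relative/absolute tolerances
        test = np.loadtxt(path_test)
        ref = np.loadtxt(path_ref)
        if test.shape != ref.shape:
            return False
        return bool(np.allclose(test, ref, rtol=rtol, atol=atol))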
I hope it helps.
Source: https://stackoverflow.com/questions/1055471/numerical-regression-testing