Question
I'm writing tests for a statistical analysis with hypothesis. Hypothesis led me to a ZeroDivisionError in my code when it is passed very sparse data. So I adapted my code to handle the exception; in my case, that means logging the reason and re-raising the exception:
try:
    val = calc(data)
except ZeroDivisionError:
    logger.error(f"check data: {data}, too sparse")
    raise
I need to pass the exception up through the call stack because the top-level caller needs to know there was an exception so that it can pass an error code to the external caller (a REST API request).
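Roughly, the top level does something like the sketch below; handle_request, the 422 payload, and the inline calc stub are simplified stand-ins for my actual code:

# Simplified stand-in for the real top-level handler; calc here is a
# dummy that fails the same way as the real code on empty/sparse data.
def calc(data):
    return sum(data) / len(data)

def handle_request(data):
    try:
        return {"status": 200, "result": calc(data)}
    except ZeroDivisionError:
        # The re-raised exception surfaces here and becomes an error
        # code for the external (REST) caller.
        return {"status": 422, "error": "data too sparse"}

print(handle_request([]))    # {'status': 422, 'error': 'data too sparse'}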
Edit: I also can't assign a reasonable fallback value to val; essentially I need a histogram, and the error happens while I'm calculating a reasonable bin width from the data. Obviously this fails when the data is sparse, and without the histogram the algorithm cannot proceed any further.
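My calc isn't shown here, but a standard bin-width rule such as Freedman-Diaconis makes the failure concrete: the bin count is the data range divided by the bin width, and the width collapses to zero whenever the data has no spread. A minimal sketch (fd_bin_count is hypothetical, with a deliberately crude quartile estimate):

import math

def fd_bin_count(xs):
    # Freedman-Diaconis rule: bin width = 2 * IQR / n ** (1/3).
    xs = sorted(xs)
    n = len(xs)
    width = 2 * (xs[(3 * n) // 4] - xs[n // 4]) / n ** (1 / 3)
    # Constant data has zero IQR, so width == 0.0 and this plain-float
    # division raises ZeroDivisionError.
    return math.ceil((xs[-1] - xs[0]) / width)

fd_bin_count([1.0, 2.0, 5.0, 9.0])    # fine
fd_bin_count([3.0, 3.0, 3.0, 3.0])    # no spread: raises ZeroDivisionError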
Now my issue is that in my test, when I do something like this:

@given(dataframe())
def test_my_calc(df):
    # code that exercises the code path above
    ...
hypothesis keeps generating failing examples that trigger ZeroDivisionError, and I don't know how to ignore this exception. Normally I would mark a test like this with pytest.mark.xfail(raises=ZeroDivisionError), but here I can't do that, since the same test passes for well-behaved inputs.
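For reference, the marker I'd normally use looks like this; the problem is that it applies to the whole test, so runs where every generated input passes would be reported as XPASS rather than as a plain pass:

import pytest
from hypothesis import given

@pytest.mark.xfail(raises=ZeroDivisionError)   # all-or-nothing: covers every example
@given(dataframe())
def test_my_calc(df):
    ...   # same body as above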
Something like this would be ideal:

- continue with the test as usual for most inputs, however
- when ZeroDivisionError is raised, skip it as an expected failure.

How could I achieve that? Do I need to put a try: ... except: ... in the test body as well? What would I need to do in the except block to mark it as an expected failure?
Edit: to address the comment by @hoefling, separating out the failing cases would be the ideal solution. But unfortunately, hypothesis doesn't give me enough handles to control that: at most I can control the total count and the limits (min, max) of the generated data. The failing cases, however, have a very narrow spread, and there is no way for me to control for that. I guess that's the point of hypothesis, and maybe I shouldn't be using hypothesis at all for this.
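The closest I can get to separating the cases is filtering the whole strategy (using the dataframe strategy defined just below); this is only a sketch, and heavy filtering runs into Hypothesis's filter_too_much health check, which is the same trade-off as assume:

# Sketch: reject generated frames whose per-group spread is too small.
safe_dataframe = dataframe().filter(
    lambda df: df.groupby(level=["city", "category"])["metric"].var().min() >= 0.01
)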
Here's how I generate my data (slightly simplified):
cities = [f"city{i}" for i in range(4)]
cats = [f"cat{i}" for i in range(4)]
@st.composite
def dataframe(draw):
data_st = st.floats(min_value=0.01, max_value=50)
df = []
for city, cat in product(cities, cats):
cols = [
column("city", elements=st.just(city)),
column("category", elements=st.just(cat)),
column("metric", elements=data_st, fill=st.nothing()),
]
_df = draw(data_frames(cols, index=range_indexes(min_size=2)))
# my attempt to control the spread
assume(np.var(_df["metric"]) >= 0.01)
df += [_df]
df = pd.concat(df, axis=0).set_index(["city", "category"])
return df
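For debugging the strategy itself, a draw can be inspected outside of a test with .example(), which is part of Hypothesis's public API (meant for interactive use only):

# Interactive inspection of what the strategy generates.
df = dataframe().example()
print(df.head())
print(df.groupby(level=["city", "category"])["metric"].var())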
Answer 1:
from hypothesis import assume, given, strategies as st

@given(...)
def test_stuff(inputs):
    try:
        ...
    except ZeroDivisionError:
        assume(False)
The assume call will tell Hypothesis that this example is "bad" and that it should try another, without failing the test. It's equivalent to calling .filter(will_not_cause_zero_division) on your strategy, if you had such a function. See the docs for details.
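Applied to the question's setup, the test would look like the sketch below (calc stands in for the code under test, which isn't shown). Hypothesis also provides reject() as a more readable synonym for assume(False).

from hypothesis import assume, given

@given(dataframe())
def test_my_calc(df):
    try:
        val = calc(df)   # the code under test; raises ZeroDivisionError on sparse data
    except ZeroDivisionError:
        assume(False)    # discard this example; Hypothesis tries another
    # assertions for well-behaved inputs go here
    assert val is not None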
Source: https://stackoverflow.com/questions/57208801/exception-handling-and-testing-with-pytest-and-hypothesis