Question
I'm writing tests for a statistical analysis with hypothesis. Hypothesis led me to a ZeroDivisionError in my code when it is passed very sparse data. So I adapted my code to handle the exception; in my case, that means logging the reason and re-raising the exception:
try:
    val = calc(data)
except ZeroDivisionError:
    logger.error(f"check data: {data}, too sparse")
    raise
I need to pass the exception up through the call stack because the top-level caller needs to know there was an exception so that it can pass an error code to the external caller (a REST API request).
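Roughly, the top level does something like the sketch below; handle_request, the 422 payload, and the inline calc stub are simplified stand-ins for my actual code:

# Simplified stand-in for the real top-level handler; calc here is a
# dummy that fails the same way as the real code on empty/sparse data.
def calc(data):
    return sum(data) / len(data)

def handle_request(data):
    try:
        return {"status": 200, "result": calc(data)}
    except ZeroDivisionError:
        # The re-raised exception surfaces here and becomes an error
        # code for the external (REST) caller.
        return {"status": 422, "error": "data too sparse"}

print(handle_request([]))    # {'status': 422, 'error': 'data too sparse'}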
Edit: I also can't assign a reasonable fallback value to val; essentially I need a histogram, and the error happens while I'm calculating a reasonable bin width from the data. Obviously this fails when the data is sparse, and without the histogram the algorithm cannot proceed any further.
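My calc isn't shown here, but a standard bin-width rule such as Freedman-Diaconis makes the failure concrete: the bin count is the data range divided by the bin width, and the width collapses to zero whenever the data has no spread. A minimal sketch (fd_bin_count is hypothetical, with a deliberately crude quartile estimate):

import math

def fd_bin_count(xs):
    # Freedman-Diaconis rule: bin width = 2 * IQR / n ** (1/3).
    xs = sorted(xs)
    n = len(xs)
    width = 2 * (xs[(3 * n) // 4] - xs[n // 4]) / n ** (1 / 3)
    # Constant data has zero IQR, so width == 0.0 and this plain-float
    # division raises ZeroDivisionError.
    return math.ceil((xs[-1] - xs[0]) / width)

fd_bin_count([1.0, 2.0, 5.0, 9.0])    # fine
fd_bin_count([3.0, 3.0, 3.0, 3.0])    # no spread: raises ZeroDivisionError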
Now my issue is that in my test, when I do something like this:

@given(dataframe())
def test_my_calc(df):
    # code that exercises the code path above
    ...
hypothesis keeps generating failing examples that trigger ZeroDivisionError, and I don't know how to ignore this exception. Normally I would mark a test like this with pytest.mark.xfail(raises=ZeroDivisionError), but here I can't do that, since the same test passes for well-behaved inputs.
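For reference, the marker I'd normally use looks like this; the problem is that it applies to the whole test, so runs where every generated input passes would be reported as XPASS rather than as a plain pass:

import pytest
from hypothesis import given

@pytest.mark.xfail(raises=ZeroDivisionError)   # all-or-nothing: covers every example
@given(dataframe())
def test_my_calc(df):
    ...   # same body as above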
Something like this would be ideal:

- continue with the test as usual for most inputs, however
- when ZeroDivisionError is raised, skip it as an expected failure.

How could I achieve that? Do I need to put a try: ... except: ... in the test body as well? What would I need to do in the except block to mark it as an expected failure?
Edit: to address the comment by @hoefling, separating out the failing cases would be the ideal solution. But unfortunately, hypothesis doesn't give me enough handles to control that: at most I can control the total count and the limits (min, max) of the generated data. The failing cases, however, have a very narrow spread, and there is no way for me to control for that. I guess that's the point of hypothesis, and maybe I shouldn't be using hypothesis at all for this.
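The closest I can get to separating the cases is filtering the whole strategy (using the dataframe strategy defined just below); this is only a sketch, and heavy filtering runs into Hypothesis's filter_too_much health check, which is the same trade-off as assume:

# Sketch: reject generated frames whose per-group spread is too small.
safe_dataframe = dataframe().filter(
    lambda df: df.groupby(level=["city", "category"])["metric"].var().min() >= 0.01
)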
Here's how I generate my data (slightly simplified):
cities = [f"city{i}" for i in range(4)]
cats = [f"cat{i}" for i in range(4)]
@st.composite
def dataframe(draw):
data_st = st.floats(min_value=0.01, max_value=50)
df = []
for city, cat in product(cities, cats):
cols = [
column("city", elements=st.just(city)),
column("category", elements=st.just(cat)),
column("metric", elements=data_st, fill=st.nothing()),
]
_df = draw(data_frames(cols, index=range_indexes(min_size=2)))
# my attempt to control the spread
assume(np.var(_df["metric"]) >= 0.01)
df += [_df]
df = pd.concat(df, axis=0).set_index(["city", "category"])
return df
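For debugging the strategy itself, a draw can be inspected outside of a test with .example(), which is part of Hypothesis's public API (meant for interactive use only):

# Interactive inspection of what the strategy generates.
df = dataframe().example()
print(df.head())
print(df.groupby(level=["city", "category"])["metric"].var())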
Answer 1:
from hypothesis import assume, given, strategies as st

@given(...)
def test_stuff(inputs):
    try:
        ...
    except ZeroDivisionError:
        assume(False)
The assume call will tell Hypothesis that this example is "bad" and that it should try another, without failing the test. It's equivalent to calling .filter(will_not_cause_zero_division) on your strategy, if you had such a function. See the docs for details.
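Applied to the question's setup, the test would look like the sketch below (calc stands in for the code under test, which isn't shown). Hypothesis also provides reject() as a more readable synonym for assume(False).

from hypothesis import assume, given

@given(dataframe())
def test_my_calc(df):
    try:
        val = calc(df)   # the code under test; raises ZeroDivisionError on sparse data
    except ZeroDivisionError:
        assume(False)    # discard this example; Hypothesis tries another
    # assertions for well-behaved inputs go here
    assert val is not None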
Source: https://stackoverflow.com/questions/57208801/exception-handling-and-testing-with-pytest-and-hypothesis