Patsy's power operator doesn't allow negative integer exponents, so if we have some series data X,
patsy.dmatrices('X + X**(-1)', X)
returns an error. How would I add the reciprocal of X to such a patsy formula?
The special patsy meaning of operators gets switched off inside embedded function calls. So if you write X + 1 / X, then patsy interprets that as the special patsy + and / operators, but if you write something like X + sin(1 / X), then patsy continues to interpret the + as a special patsy operator, while the whole sin(1 / X) expression gets passed to Python to evaluate, and Python will evaluate the / as regular division.
So that's fine if we wanted to compute sin(1 / X). But we don't (why would we?). We just want plain 1 / X. So how can we do that?
Well, we can be tricky: we need a function call to trick patsy's parser into ignoring the / and giving it to Python -- but there's nothing that says that function has to do anything. We could just define an identity function:
    def identity(value):
        return value
and then use that in a formula like X + identity(1 / X).
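Here is a minimal sketch of the identity trick end to end. The sample values for X are made up for illustration, and the code assumes patsy and numpy are installed; patsy looks up identity in the environment where dmatrix is called.

```python
import numpy as np
from patsy import dmatrix


def identity(value):
    # Does nothing; it exists only so patsy hands "1 / X" to Python as-is.
    return value


# Illustrative data (an assumption, not from the question).
data = {"X": np.array([1.0, 2.0, 4.0])}

# patsy treats "+" as its own operator, but evaluates identity(1 / X) in Python.
design = dmatrix("X + identity(1 / X)", data)

# Columns come out as: Intercept, X, identity(1 / X).
reciprocal = np.asarray(design)[:, 2]
```

The last column should hold the elementwise reciprocal of X, which is what the formula was after.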
And in fact, this trick is so handy that patsy has already predefined a function for you, and provides it as a built-in called I(...). Generally, you can think of I(...) as a kind of quoting operator -- it's a way to say "hey patsy, please do not try to interpret anything in this region, just pass it through to Python kthx".
So to answer your original question: try writing dmatrix("X + I(1 / X)", data).
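As a quick check, something like the following should work. The values in data are invented for illustration, and a plain dict stands in for whatever Series or DataFrame you actually have.

```python
import numpy as np
from patsy import dmatrix

# Illustrative data (an assumption); a pandas DataFrame with an "X" column works too.
data = {"X": np.array([1.0, 2.0, 4.0])}

# I(...) quotes its contents, so "/" here is ordinary Python division.
design = dmatrix("X + I(1 / X)", data)

column_names = design.design_info.column_names
values = np.asarray(design)
```

Here column_names lists the intercept, X, and the quoted I(1 / X) term, and the last column of values holds the reciprocals.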
(Next question: why this weird hack with the function I and everything? The answer to that is that this is how R did it 30 years ago, and I couldn't think of anything sufficiently better to be worth breaking compatibility.)
Source: https://stackoverflow.com/questions/32484244/reciprocals-in-patsy