I'm trying to understand how to replicate the poly() function in R using scikit-learn (or another module).
For example, let's say I have a vector in R:
a
It turns out that you can replicate the result of R's poly(x, p) function by performing a QR decomposition of a matrix whose columns are the powers of the input vector x, from the 0th power (all ones) up to the pth power. The Q matrix, minus the first constant column, gives you the result you want.
So, the following should work:
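import numpy as np

def poly(x, p):
    x = np.asarray(x, dtype=float)
    # Columns of X are x**0, x**1, ..., x**p (a list, since vstack rejects generators)
    X = np.transpose(np.vstack([x**k for k in range(p + 1)]))
    # Keep Q from the QR decomposition, dropping the constant first column
    return np.linalg.qr(X)[0][:, 1:]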
In particular:
In [29]: poly([1,2,3,4,5,6,7,8,9,10], 3)
Out[29]:
array([[-0.49543369,  0.52223297,  0.45342519],
       [-0.38533732,  0.17407766, -0.15114173],
       [-0.27524094, -0.08703883, -0.37785433],
       [-0.16514456, -0.26111648, -0.33467098],
       [-0.05504819, -0.34815531, -0.12955006],
       [ 0.05504819, -0.34815531,  0.12955006],
       [ 0.16514456, -0.26111648,  0.33467098],
       [ 0.27524094, -0.08703883,  0.37785433],
       [ 0.38533732,  0.17407766,  0.15114173],
       [ 0.49543369,  0.52223297, -0.45342519]])
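As a quick sanity check on the QR approach, here is a minimal sketch that builds the same power matrix with np.vander and confirms that the returned columns are orthonormal and sum to zero (that is, orthogonal to the constant column). The name poly_vander is made up for this illustration. Note that a QR decomposition is only unique up to the sign of each column, so individual columns may come out negated relative to R's output; comparing absolute values is one way to sidestep that.

```python
import numpy as np

def poly_vander(x, p):
    # Hypothetical variant of poly() above, using np.vander for the power matrix
    x = np.asarray(x, dtype=float)
    X = np.vander(x, p + 1, increasing=True)  # columns x**0, x**1, ..., x**p
    Q, _ = np.linalg.qr(X)                    # reduced QR: Q is len(x) by (p + 1)
    return Q[:, 1:]                           # drop the constant column

Z = poly_vander(range(1, 11), 3)

# Columns should be orthonormal and orthogonal to the all-ones vector
print(np.allclose(Z.T @ Z, np.eye(3)))
print(np.allclose(Z.sum(axis=0), 0.0))

# Compare magnitudes only, since column signs are not fixed by QR
print(np.round(np.abs(Z[0]), 8))
```

If a fixed sign convention is needed, one option is to flip each column so that a chosen entry (for example the last row) is positive.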
The answer by K. A. Buhr is full and complete.
The R poly function also calculates interactions of different degrees of the members. That's why I was looking for the R poly equivalent.
sklearn.preprocessing.PolynomialFeatures seems to provide this; you can apply the np.linalg.qr(X)[0][:,1:] step afterwards to get the orthogonal matrix.
Something like this:
import numpy as np
import pprint
import sklearn.preprocessing

PP = pprint.PrettyPrinter(indent=4)
MATRIX = np.array([[4, 2], [2, 3], [7, 4]])
# Degree-2 expansion of columns (a, b): 1, a, b, a^2, a*b, b^2
poly = sklearn.preprocessing.PolynomialFeatures(2)
PP.pprint(MATRIX)
X = poly.fit_transform(MATRIX)
PP.pprint(X)
Results in:
array([[4, 2],
       [2, 3],
       [7, 4]])
array([[ 1.,  4.,  2., 16.,  8.,  4.],
       [ 1.,  2.,  3.,  4.,  6.,  9.],
       [ 1.,  7.,  4., 49., 28., 16.]])
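To make the "QR step afterwards" concrete, here is a rough sketch that chains the two pieces. The helper name orthogonal_poly_features and the random test data are assumptions made for this example, and it uses more rows than expanded columns, because with only 3 rows (as in MATRIX above) the reduced Q has just 3 columns. This only shows that the resulting columns are orthonormal; whether it matches R's poly output for several input vectors is not verified here.

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

def orthogonal_poly_features(data, degree):
    # Hypothetical helper: expand with PolynomialFeatures, then orthogonalise via QR
    data = np.asarray(data, dtype=float)
    # include_bias=True puts the all-ones column first so it can be dropped after QR
    X = PolynomialFeatures(degree=degree, include_bias=True).fit_transform(data)
    Q, _ = np.linalg.qr(X)  # reduced QR: Q has min(n_rows, n_cols) columns
    return Q[:, 1:]

# Assumed example data: 20 samples, 2 features -> 6 expanded columns at degree 2
rng = np.random.default_rng(0)
data = rng.normal(size=(20, 2))
Z = orthogonal_poly_features(data, 2)

print(Z.shape)
print(np.allclose(Z.T @ Z, np.eye(Z.shape[1])))
```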