How to generate many interaction terms in Pandas?
Solution 1
I recently faced a similar problem: I needed a flexible way to create specific interactions, so I searched StackOverflow. Following the tip in @user333700's comment above, I found patsy (http://patsy.readthedocs.io/en/latest/overview.html) and, after a Google search, its scikit-learn integration patsylearn (https://github.com/amueller/patsylearn).
So going through the example of @motam79, this is possible:
import numpy as np
import pandas as pd
from patsylearn import PatsyTransformer

# Example feature matrix from @motam79's answer
x = np.array([[ 3, 20, 11],
              [ 6,  2,  7],
              [18,  2, 17],
              [11, 12, 19],
              [ 7, 20,  6]])
df = pd.DataFrame(x, columns=["a", "b", "c"])

# In a patsy formula, "a:b" is the interaction (product) of columns a and b
x_t = PatsyTransformer("a:b + a:c + b:c", return_type="dataframe").fit_transform(df)
This returns the following:
     a:b    a:c    b:c
0   60.0   33.0  220.0
1   12.0   42.0   14.0
2   36.0  306.0   34.0
3  132.0  209.0  228.0
4  140.0   42.0  120.0
I answered a similar question, with another example that uses categorical variables, here: How can an interaction design matrix be created from categorical variables?
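If patsylearn is not installed, the same matrix can probably be built with patsy alone. The sketch below assumes patsy's dmatrix function and adds "- 1" to the formula, because plain patsy includes an intercept column by default; that term is my addition and is not part of the formula used above.

```python
import numpy as np
import pandas as pd
from patsy import dmatrix

x = np.array([[ 3, 20, 11],
              [ 6,  2,  7],
              [18,  2, 17],
              [11, 12, 19],
              [ 7, 20,  6]])
df = pd.DataFrame(x, columns=["a", "b", "c"])

# "- 1" drops the intercept that patsy would otherwise add (assumption: you do not want it here)
x_t = dmatrix("a:b + a:c + b:c - 1", df, return_type="dataframe")
print(x_t)
```

The resulting columns are named after the formula terms, which makes it easy to join them back onto the original DataFrame.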
Solution 2
You can use sklearn's PolynomialFeatures transformer. Here is an example:
Let's assume this is your design (i.e. feature) matrix:
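from numpy import array
from sklearn.preprocessing import PolynomialFeatures

x = array([[ 3, 20, 11],
           [ 6,  2,  7],
           [18,  2, 17],
           [11, 12, 19],
           [ 7, 20,  6]])

# degree 2 with interaction_only=True keeps pairwise products but no squared terms
x_t = PolynomialFeatures(2, interaction_only=True, include_bias=False).fit_transform(x)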
Here is the result:
array([[ 3., 20., 11., 60., 33., 220.],
[ 6., 2., 7., 12., 42., 14.],
[ 18., 2., 17., 36., 306., 34.],
[ 11., 12., 19., 132., 209., 228.],
[ 7., 20., 6., 140., 42., 120.]])
The first 3 features are the original features, and the next three are interactions of the original features.
pdevar
Updated on July 02, 2022
Comments
-
pdevar almost 2 years
I would like to estimate an IV regression model using many interactions with year, demographic, and other dummies. I can't find an explicit method to do this in Pandas and am curious if anyone has tips.
I'm thinking of trying scikit-learn and this function:
http://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.PolynomialFeatures.html
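For the dummy-variable case described here, a pandas-only approach is also possible. The sketch below is one way to do it under stated assumptions: the toy data and the column names (year, group, x) are hypothetical, and the "left:right" naming of the interaction columns is just a convention chosen for this example.

```python
from itertools import product

import pandas as pd

# Hypothetical toy data; replace with your own columns
df = pd.DataFrame({
    "year": [2000, 2001, 2000, 2001],
    "group": ["a", "a", "b", "b"],
    "x": [1.0, 2.0, 3.0, 4.0],
})

# One dummy column per category level
year_d = pd.get_dummies(df["year"], prefix="year", dtype=float)
group_d = pd.get_dummies(df["group"], prefix="group", dtype=float)

# Dummy-by-dummy interactions: product of every year dummy with every group dummy
dummy_inter = pd.DataFrame(
    {f"{y}:{g}": year_d[y] * group_d[g]
     for y, g in product(year_d.columns, group_d.columns)},
    index=df.index,
)

# Dummy-by-continuous interactions: each year dummy scaled row-wise by x
x_by_year = year_d.mul(df["x"], axis=0).add_suffix(":x")

design = pd.concat([df[["x"]], dummy_inter, x_by_year], axis=1)
print(design)
```

With many levels this produces a wide matrix, so in a regression you would usually drop one reference level per dummy set to avoid collinearity.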