permutations with unique values

爱一瞬间的悲伤 2020-11-22 01:53

itertools.permutations generates permutations in which elements are treated as unique based on their position, not on their value. So basically I want to avoid duplicates like this:
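
As a minimal illustration (the input [1, 1, 2] is an assumption, not the asker's own example), the position-based behaviour can be seen as follows. Deduplicating through a set is shown only as the naive workaround; it still builds every positional permutation first.

```python
from itertools import permutations

# Hypothetical input with one repeated value (not taken from the original post).
items = [1, 1, 2]

# permutations() distinguishes the two 1s by position, so each distinct
# ordering shows up more than once.
all_perms = list(permutations(items))
print(all_perms)

# Naive workaround: collapse duplicates with a set. This is simple, but it
# still generates all len(items)! tuples before discarding the repeats.
unique_perms = sorted(set(all_perms))
print(unique_perms)
```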

19 answers
  •  北恋 (original poster)
    2020-11-22 02:44

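    To generate the unique combinations of ["A","B","C","D"] (every subset of every size, ignoring order, so strictly speaking combinations rather than permutations) I use the following: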

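    from itertools import combinations, chain
    
    l = ["A","B","C","D"]
    # one combinations() iterator per subset size r = 1 .. len(l)
    combs = (combinations(l, r) for r in range(1, len(l) + 1))
    # flatten the per-size iterators into a single list of tuples
    list_combinations = list(chain.from_iterable(combs))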
    

    Which generates:

    [('A',),
     ('B',),
     ('C',),
     ('D',),
     ('A', 'B'),
     ('A', 'C'),
     ('A', 'D'),
     ('B', 'C'),
     ('B', 'D'),
     ('C', 'D'),
     ('A', 'B', 'C'),
     ('A', 'B', 'D'),
     ('A', 'C', 'D'),
     ('B', 'C', 'D'),
     ('A', 'B', 'C', 'D')]
    

    Note that no duplicates are created: for example, no tuples starting with D such as ('D', 'A') are generated, because the same items already appear as ('A', 'D').
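
    Combinations ignore order, so they do not cover the case in the question, where the distinct orderings of a list with repeated values are wanted. A minimal sketch of one way to do that is given here, assuming the items are hashable. The function name unique_permutations is hypothetical; it is not part of itertools. It counts each distinct value and backtracks over values rather than positions, so equal items are never swapped with each other.

```python
from collections import Counter

def unique_permutations(items):
    """Yield each distinct ordering of items exactly once (items must be hashable)."""
    counts = Counter(items)   # value -> number of copies still available
    n = len(items)
    current = []

    def backtrack():
        if len(current) == n:
            yield tuple(current)
            return
        # Loop over distinct values, not positions, so two equal items
        # can never produce the same ordering twice.
        for value in counts:
            if counts[value] > 0:
                counts[value] -= 1
                current.append(value)
                yield from backtrack()
                current.pop()
                counts[value] += 1

    yield from backtrack()

print(list(unique_permutations(["A", "A", "B"])))
```

    For an input with no repeated values this yields the same number of tuples as itertools.permutations.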

    Example: this can be used to generate main-effect and interaction terms of every order for an OLS model, using the column names of a pandas DataFrame.

    from itertools import chain, combinations
    
    import pandas as pd
    import statsmodels.formula.api as smf
    
    # create some data (somedata is a placeholder for your own data)
    pd_dataframe = pd.DataFrame(somedata)
    response_column = "Y"
    
    # generate combinations of column/variable names
    l = [col for col in pd_dataframe.columns if col!=response_column]
    combs = (combinations(l, r) for r in range(1, len(l) + 1))
    list_combinations = list(chain.from_iterable(combs))
    
    # generate OLS input string
    formula_base = '{} ~ '.format(response_column)
    list_for_ols = [":".join(item) for item in list_combinations]
    string_for_ols = formula_base + ' + '.join(list_for_ols)
    

    This creates:

    Y ~ A + B + C + D + A:B + A:C + A:D + B:C + B:D + C:D + A:B:C + A:B:D + A:C:D + B:C:D + A:B:C:D
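
    The formula-building steps above can be wrapped in a small helper that needs neither pandas nor statsmodels, which makes the generated string easy to check on its own. This is only a sketch: the name interaction_formula is hypothetical, and the predictor names are assumed to be valid formula identifiers.

```python
from itertools import chain, combinations

def interaction_formula(response, predictors):
    """Build 'response ~ ...' containing every main effect and interaction term."""
    # All non-empty subsets of the predictors, ordered by subset size.
    combs = chain.from_iterable(
        combinations(predictors, r) for r in range(1, len(predictors) + 1)
    )
    # Join the names in each subset with ':' to form one interaction term.
    terms = [":".join(combo) for combo in combs]
    return "{} ~ {}".format(response, " + ".join(terms))

formula = interaction_formula("Y", ["A", "B", "C", "D"])
print(formula)
```

    For n predictors this produces 2**n - 1 terms, so the formula grows quickly. In the patsy formula syntax, the shorthand Y ~ A*B*C*D should expand to the same set of main effects and interactions.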
    

    This string can then be passed to your OLS regression:

    model = smf.ols(string_for_ols, data=pd_dataframe).fit()
    print(model.summary())
    
