A Faster Way of Removing Unused Categories in Pandas?

-上瘾入骨i 2021-01-19 17:40

I'm running some models in Python, with the data subset by category.

For memory usage, and preprocessing, all the categorical variables are stored as category data t

1 Answer
  • 2021-01-19 18:01

    The problem is that you assign z.get_group(i) directly to x, so x is a subset of z's data that is still tied to z. Take an explicit copy and your code will work:

    for i in z.groups:
        x = z.get_group(i).copy()  # independent copy, no longer tied to z
        # drop the categories that do not occur in this group
        x["x"] = x["x"].cat.remove_unused_categories()
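
For anyone who wants to try the fix end to end, here is a minimal self-contained sketch. The question's DataFrame is not shown, so `df`, its columns, and the `subsets` dict are made-up placeholders; only the `.copy()` plus `remove_unused_categories()` pattern comes from the answer above. `observed=True` is passed so that only categories actually present in the data form groups.

```python
import pandas as pd

# Hypothetical data: the original question's DataFrame is not shown.
df = pd.DataFrame({
    "x": pd.Categorical(["a", "a", "b", "c"]),
    "y": [1, 2, 3, 4],
})

# Group on the categorical column; observed=True skips empty categories.
z = df.groupby("x", observed=True)

subsets = {}  # hypothetical container for the per-group frames
for key in z.groups:
    sub = z.get_group(key).copy()  # independent copy, not tied to z/df
    # Keep only the categories that actually occur in this group.
    sub["x"] = sub["x"].cat.remove_unused_categories()
    subsets[key] = sub

# Each subset now carries a single category; df itself is unchanged.
for key, sub in subsets.items():
    print(key, list(sub["x"].cat.categories))
```

Because each subset is a copy, the original `df` keeps its full category list, which matters if you need to recombine the pieces later.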
    