ValueError: Grouper for '<class 'pandas.core.frame.DataFrame'>' not 1-dimensional

前端未结


I have the following code, which creates a table and a bar plot via seaborn.

#Building a dataframe grouped by the # of Engagement Types
sales_type = sales.g


                      


      
      
        
5 Answers

        
                         				            
            
           
            
                              
                
              
              
                
                  醉梦人生        
                
              
                            
                2020-12-30 19:44
              
            
            
                                                                       
This happened to me when I called pivot_table on df instead of on pd, as in:
df.pivot_table(df[["....

instead of
pd.pivot_table(df[["...

                                                                        
                                                        
            
            
              
                
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  傲寒        
                
              
                            
                2020-12-30 19:47
              
            
            
                                                                       
Something to add to @w-m's answer.

If you are adding multiple columns from one dataframe to another:

df1[['col1', 'col2']] = df2[['col1', 'col2']]


this will create MultiIndex columns, and if you then try to group df1 by anything, you will get this error.

To solve this, get rid of the MultiIndex by keeping only its first level:

df1.columns = df1.columns.get_level_values(0)
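A minimal sketch of that fix, assuming a frame whose columns have ended up as a one-level MultiIndex. The frame is built by hand here purely for illustration, and the column names foo and bar are hypothetical:

```python
import pandas as pd

# Hypothetical frame whose columns are (accidentally) a one-level MultiIndex,
# produced here by wrapping the column list in an extra pair of brackets.
df1 = pd.DataFrame([[1, 10], [1, 20], [2, 30]], columns=[["foo", "bar"]])
was_multi = isinstance(df1.columns, pd.MultiIndex)

# Keep only the first level so the columns become a plain Index again.
df1.columns = df1.columns.get_level_values(0)

# Grouping by a column name now resolves to a single 1-dimensional column.
totals = df1.groupby("foo")["bar"].sum()
print(was_multi, totals.to_dict())
```

Note that get_level_values(0) discards any deeper levels, so this is only appropriate when the first level alone identifies each column.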

                                                                        
                                                        
            
            
              
                
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  天命终不由人        
                
              
                            
                2020-12-30 19:53
              
            
            
                                                                       
Simplified problem
I also ran into this problem and found both its cause and the obvious solution.
To reproduce it:
df = pd.DataFrame({"foo": [1,2,3], "bar": [1,2,3]})
df.rename(columns={'foo': 'bar'}, inplace=True)

   bar  bar
0    1    1
1    2    2
2    3    3

df.groupby('bar')

ValueError: Grouper for 'bar' not 1-dimensional

Like a lot of cryptic pandas errors, this one stems from having two columns with the same name.
Figure out which one you want to use, rename or drop the other column, and redo the operation.
Solution
Rename the columns like this:
df.columns = ['foo', 'bar']

   foo  bar
0    1    1
1    2    2
2    3    3

df.groupby('bar')
<pandas.core.groupby.DataFrameGroupBy object at 0x1066dd950>
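When the frame is wide, the duplicated name may not be obvious from the printed output. The sketch below shows one way to list repeated column names and, as one possible repair (an assumption, not the only option), keep the first column for each name:

```python
import pandas as pd

df = pd.DataFrame({"foo": [1, 2, 3], "bar": [1, 2, 3]})
df = df.rename(columns={"foo": "bar"})  # both columns are now named 'bar'

# List every column name that occurs more than once.
dupes = df.columns[df.columns.duplicated()].tolist()

# One possible repair: keep the first column for each name, drop later repeats.
deduped = df.loc[:, ~df.columns.duplicated()]

# With unique column names, 'bar' refers to exactly one column again.
sizes = deduped.groupby("bar").size()
print(dupes, sizes.to_dict())
```

Dropping repeats silently discards data, so renaming is the safer choice when both columns carry information you need.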

                                                                        
                                                        
            
            
              
                
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  感动是毒        
                
              
                            
                2020-12-30 20:03
              
            
            
                                                                       
TL;DR: what the error really says is that, for some or all index labels in df, you are assigning more than one group label, so groupby() doesn't know which label it should use for grouping.
First of all, let's make sure we truly understand what groupby() does.
We will be using this example df throughout:
import pandas as pd
import numpy as np
df = pd.DataFrame(
    {"fruit": ['apple', 'apple', 'orange', 'orange'], "color": ['r', 'g', 'b', 'r']},
    index=[11, 22, 33, 44],
)

"""
[df] df:
+----+---------+---------+
|    | fruit   | color   |
|----+---------+---------|
| 11 | apple   | r       |
| 22 | apple   | g       |
| 33 | orange  | b       |
| 44 | orange  | r       |
+----+---------+---------+
"""

Here is a valid df.groupby():
gp = df.groupby(
    {
        0: 'mine',
        1: 'mine',
        11: 'mine',
        22: 'mine',
        33: 'mine',
        44: 'you are rats with wings!',
    }
)
"""
[df] [group] mine:
+----+---------+---------+
|    | fruit   | color   |
|----+---------+---------|
| 11 | apple   | r       |
| 22 | apple   | g       |
| 33 | orange  | b       |
+----+---------+---------+

[df] [group] you are rats with wings!:
+----+---------+---------+
|    | fruit   | color   |
|----+---------+---------|
| 44 | orange  | r       |
+----+---------+---------+
"""

groupby() doesn't need to care about df or 'fruit' or 'color' or Nemo. It only cares about one thing: a lookup table that tells it which df.index value is mapped to which label (i.e. group name).
In this case, for example, the dictionary passed to groupby() instructs it to:

if you see index 11, then it is a "mine"; put the row with that index in the group named "mine".

if you see index 22, then it is a "mine"; put the row with that index in the group named "mine".

...

Even the keys 0 and 1 not being in df.index is not a problem.
The conventional df.groupby('fruit') or df.groupby(df['fruit']) follows exactly the rule above: df['fruit'] is the lookup table, and it tells groupby() that index 11 is an "apple".
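To make the lookup-table view concrete, here is a small sketch on the same example df. The dictionary named lookup is a hypothetical mapping introduced only for this illustration:

```python
import pandas as pd

df = pd.DataFrame(
    {"fruit": ["apple", "apple", "orange", "orange"], "color": ["r", "g", "b", "r"]},
    index=[11, 22, 33, 44],
)

# A dict is an explicit lookup table: index label -> group name.
lookup = {11: "mine", 22: "mine", 33: "mine", 44: "yours"}
by_dict = df.groupby(lookup).size()

# A column name and the column itself describe the same lookup table,
# because df["fruit"] maps each index label to exactly one fruit.
by_name = df.groupby("fruit").size()
by_series = df.groupby(df["fruit"]).size()

print(by_dict.to_dict(), by_name.to_dict(), by_name.equals(by_series))
```

In all three calls, every index label resolves to exactly one group name, which is the condition that the error complains about when it is violated.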
Now, regarding: Grouper for '<class 'pandas.core.frame.DataFrame'>' not 1-dimensional
What it really says is: for some or all index labels in df, you are assigning more than one label.
[1] df.groupby(df) in this example will not work; groupby() will complain: is index 11 an "apple" or an "r"? Make up your mind!
[2] The code below will not work either: although the mapping is 1D, it maps index 11 to "mine" as well as "yours". DataFrame and Series allow non-unique indexes, so be careful.
mapping = pd.DataFrame(index= [ 11,     11,      22,     33,     44    ], 
                       data = ['mine', 'yours', 'mine', 'mine', 'yours'], )
df.groupby(mapping)

# different error message, but same idea
mapping = pd.Series(   index= [ 11,     11,      22,     33,     44    ], 
                       data = ['mine', 'yours', 'mine', 'mine', 'yours'], )
df.groupby(mapping)
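One way to guard against case [2] is to check the mapping's index for repeats before grouping. This is a sketch, and resolving conflicts by keeping the first label is an assumption made for illustration rather than a recommendation:

```python
import pandas as pd

df = pd.DataFrame(
    {"fruit": ["apple", "apple", "orange", "orange"], "color": ["r", "g", "b", "r"]},
    index=[11, 22, 33, 44],
)
mapping = pd.Series(
    index=[11, 11, 22, 33, 44],
    data=["mine", "yours", "mine", "mine", "yours"],
)

# Index labels that are mapped to more than one group name.
ambiguous = mapping.index[mapping.index.duplicated()].unique().tolist()

# Assumption for illustration: keep the first label seen for each index value.
unique_mapping = mapping[~mapping.index.duplicated(keep="first")]

# Every index label now maps to exactly one group, so grouping works.
sizes = df.groupby(unique_mapping).size()
print(ambiguous, sizes.to_dict())
```

If the conflicting labels are a sign of bad upstream data, raising an error when ambiguous is non-empty may be a better choice than silently picking one.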

                                                                        
                                                        
            
            
              
                
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  傲寒        
                
              
                            
                2020-12-30 20:09
              
            
            
                                                                       
This happened to me when I accidentally created MultiIndex columns:

>>> values = np.asarray([[1, 1], [2, 2], [3, 3]])

# notice accidental double brackets around column list
>>> df = pd.DataFrame(values, columns=[["foo", "bar"]])

# prints very innocently
>>> df
  foo bar
0   1   1
1   2   2
2   3   3

# but throws this error
>>> df.groupby("foo")
ValueError: Grouper for 'foo' not 1-dimensional

# cause:
>>> df.columns
MultiIndex(levels=[['bar', 'foo']],
           labels=[[1, 0]])

# fix by using correct columns list
>>> df = pd.DataFrame(values, columns=["foo", "bar"])
>>> df.groupby("foo")
<pandas.core.groupby.groupby.DataFrameGroupBy object at 0x7f9a280cbb70>

                                                                        
                                                        
            
            
              
                