Pandas Series of lists to one series

后端未结


I have a Pandas Series of lists of strings:

0    [slim, waist, man]
1     [slim, waistline]
2


                      


      
      
        
9 Answers

        
                         				            
            
           
            
                              
                
              
              
                
                  无人共我        
                
              
                            
                2020-12-28 13:33
              
            
            
                                                                       
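Flattening and unflattening can be done with the two functions below. Flattening:

import pandas as pd

def flatten(df, col):
    # Build one (original index, element) row per list element in df[col].
    col_flat = pd.DataFrame(
        [[i, x] for i, y in df[col].apply(list).items() for x in y],
        columns=['I', col])
    col_flat = col_flat.set_index('I')
    # Drop the list column, then join the flattened values back on the index.
    # (Series.iteritems() and positional drop(col, 1) were removed in pandas 2.0.)
    df = df.drop(columns=col)
    df = df.merge(col_flat, left_index=True, right_index=True)

    return df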


Unflattening:

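def unflatten(flat_df, col):
    # Group rows that share an index label: keep the first value of every
    # other column and collect the values of col back into a list.
    return flat_df.groupby(level=0).agg({**{c: 'first' for c in flat_df.columns}, col: list})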


After unflattening we get the same dataframe, except possibly for column order (here col is the name of the list column):

(df.sort_index(axis=1) == unflatten(flatten(df, col), col).sort_index(axis=1)).all().all()
>> True
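
For comparison, here is a minimal sketch of the same round trip using the built-in DataFrame.explode, assuming pandas 0.25 or newer. The sample frame and its column names (id, tags) are made up for illustration and are not from the original answer.

```python
import pandas as pd

# Hypothetical sample data; the column names are assumptions for illustration.
df = pd.DataFrame({
    "id": [10, 20],
    "tags": [["slim", "waist"], ["santa"]],
})

# Flatten: one row per list element, with the original index labels repeated.
flat = df.explode("tags")

# Unflatten: group by the repeated index labels and rebuild the lists.
restored = flat.groupby(level=0).agg({"id": "first", "tags": list})
```

Note that explode turns an empty list into a single NaN row, so a frame containing empty lists would not round-trip exactly with this sketch.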

                                                                        
                                                        
            
            
              
                
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  北荒        
                
              
                            
                2020-12-28 13:34
              
            
            
                                                                       
You are basically just trying to flatten a nested list here.

You should be able to simply iterate over the elements of the series:

slist = []
for x in series:
    slist.extend(x)


or use a more compact (but harder to read) list comprehension:

slist = [st for row in series for st in row]
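
As a quick check, the loop and the comprehension can be run side by side on the sample data used elsewhere in this thread. This is only a sketch; the variable names flat_series and slist_comp are assumptions added for illustration.

```python
import pandas as pd

series = pd.Series([
    ["slim", "waist", "man"],
    ["slim", "waistline"],
    ["santa"],
])

# Explicit loop: append every element of every row to one flat list.
slist = []
for row in series:
    slist.extend(row)

# Equivalent nested list comprehension.
slist_comp = [st for row in series for st in row]

# Wrap the flat list in a new Series with a fresh 0..n-1 index.
flat_series = pd.Series(slist)
```

Both approaches visit the rows in index order, so the flattened elements keep their original order.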

                                                                        
                                                        
            
            
              
                
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  南笙        
                
              
                            
                2020-12-28 13:36
              
            
            
                                                                       
Here's a simple method using only pandas functions:

import pandas as pd

s = pd.Series([
    ['slim', 'waist', 'man'],
    ['slim', 'waistline'],
    ['santa']])


Then

s.apply(pd.Series).stack().reset_index(drop=True)


gives the desired output. In some cases you might want to keep the original index and add a second index level for the position of each nested element, e.g.

0  0         slim
   1        waist
   2          man
1  0         slim
   1    waistline
2  0        santa


If this is what you want, just omit .reset_index(drop=True) from the chain.
                                                                        
                                                        
            
            
              
                
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  礼貌的吻别        
                
              
                            
                2020-12-28 13:37
              
            
            
                                                                       
If your pandas version is too old to use series_name.explode(), this should work too:

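from itertools import chain

import pandas as pd

# series_name is the Series of lists.
# Series.iteritems() was removed in pandas 2.0; items() is the replacement.
pd.Series(
    chain.from_iterable(
        value
        for i, value
        in series_name.items()
    )
)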
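
For reference, a minimal sketch of the explode() route mentioned above, assuming pandas 0.25 or newer (the release that added Series.explode). The name series_name follows the answer; the sample data is the one used elsewhere in this thread.

```python
import pandas as pd

series_name = pd.Series([
    ["slim", "waist", "man"],
    ["slim", "waistline"],
    ["santa"],
])

# explode() emits one row per list element and repeats the original index label.
exploded = series_name.explode()

# Drop the repeated labels to get a plain 0..n-1 index.
flat = exploded.reset_index(drop=True)
```

Keeping the repeated index (the exploded variable) is useful when each element still needs to be traced back to its source row.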

                                                                        
                                                        
            
            
              
                
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  慢半拍i        
                
              
                            
                2020-12-28 13:38
              
            
            
                                                                       
You can try using itertools.chain to simply flatten the lists:

In [70]: from itertools import chain
In [71]: import pandas as pnd
In [72]: s = pnd.Series([['slim', 'waist', 'man'], ['slim', 'waistline'], ['santa']])
In [73]: s
Out[73]: 
0    [slim, waist, man]
1     [slim, waistline]
2               [santa]
dtype: object
In [74]: new_s = pnd.Series(list(chain(*s.values)))
In [75]: new_s
Out[75]: 
0         slim
1        waist
2          man
3         slim
4    waistline
5        santa
dtype: object

                                                                        
                                                        
            
            
              
                
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  攒了一身酷        
                
              
                            
                2020-12-28 13:39
              
            
            
                                                                       
You can use the list concatenation operator, as below:

lst1 = ['hello','world']
lst2 = ['bye','world']
newlst = lst1 + lst2
print(newlst)
>> ['hello', 'world', 'bye', 'world']


Or you can use the list.extend() method, as below:

lst1 = ['hello','world']
lst2 = ['bye','world']
lst1.extend(lst2)
print(lst1)
>> ['hello', 'world', 'bye', 'world']


The benefit of extend() is that it accepts any iterable, whereas the concatenation operator only works when both the left-hand and right-hand operands are lists.

Another example of extend(), this time with a tuple:

lst1 = ['hello','world']
lst1.extend(('Bye','Bye'))
print(lst1)
>> ['hello', 'world', 'Bye', 'Bye']
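
The difference between extend() and the + operator can be seen in a small sketch like the following. The variable names (lst, concat_failed) are made up for illustration.

```python
lst = ["hello", "world"]

# extend() accepts any iterable: here a tuple and then a generator expression.
lst.extend(("bye",))
lst.extend(word.upper() for word in ["a", "b"])

# The + operator requires a list on both sides; list + tuple raises TypeError.
try:
    lst + ("x",)
    concat_failed = False
except TypeError:
    concat_failed = True
```

This is why extend() fits the loop over a Series of lists: each row can be any iterable and still be appended element by element.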

                                                                        
                                                        
            
            
              
                
                        
                    
                
          
          	          
   
          