In-place way to apply a permutation to a list? (inverse of sorting-by-key)

后端未结

关注

 7  627

Here\'s a example of what I want to do

spam_list = [\"We\", \"are\", \"the\", \"knights\", \"who\", \"say\", \"Ni\"]
spam_order = [0,1,2,4,5,6,3]
spam_list.m


                      
              相关标签:


      
      
        
          7条回答        

        
                         				            
            
           
            
                              
                
              
              
                
                  北荒        
                
              
                            
                2021-01-02 05:21
              
            
            
                                                                       
map(lambda x:spam_list[x], spam_order)

                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  萌比男神i        
                
              
                            
                2021-01-02 05:26
              
            
            
                                                                       

  but I would like to directly affect spam_list, like list.sort() and not copy it like sorted()


There is ONLY ONE SOLUTION, that does exactly what you ask. Every single other solution is implicitly making a copy of one or both lists (or turning it into a dict, etc.). What you are asking for is a method which sorts two lists in-place, using O(1) extra space, using one list as the keys of the other. I personally would just accept the extra space complexity, but if you really wanted to, you could do this:

(edit: it may be the case that the original poster doesn't really care about .sort because it's efficient, but rather because it modifies state; in general this is a dangerous thing to want and non-low-level languages attempt to avoid this and even ban it, but the solutions which use slice assignment will achieve "in-place" semantics)


Create a custom dictionary subclass (effectively a Zip class) which is backed by both lists you are sorting.
Indexing myZip[i] -> results in the tuple (list1[i],list2[i])
Assignment myZip[i]=(x1,x2) -> dispatches into list1[i]=x1, list2[i]=x2.
Use that to do myZip(spam_list,spam_order).sort(), and now both spam_list and spam_order are sorted in-place


Example:

#!/usr/bin/python3

class LiveZip(list):
    def __init__(self, list1, list2):
        self.list1 = list1
        self.list2 = list2

    def __len__(self):
        return len(self.list1)

    def __getitem__(self, i):
        return (self.list1[i], self.list2[i])

    def __setitem__(self, i, tuple):
        x1,x2 = tuple
        self.list1[i] = x1
        self.list2[i] = x2

spam_list = ["We", "are", "the", "knights", "who", "say", "Ni"]
spam_order = [0,1,2,4,5,6,3]

#spam_list.magical_sort(spam_order)
proxy = LiveZip(spam_order, spam_list)


Now let's see if it works...

#proxy.sort()
#fail --> oops, the internal implementation is not meant to be subclassed! lame
# It turns out that the python [].sort method does NOT work without passing in
# a list to the constructor (i.e. the internal implementation does not use the
# public interface), so you HAVE to implement your own sort if you want to not
# use any extra space. This kind of dumb. But the approach above means you can 
# just use any standard textbook in-place sorting algorithm:
def myInPlaceSort(x):
    # [replace with in-place textbook sorting algorithm]


NOW it works:

myInPlaceSort(proxy)

print(spam_list)


Unfortunately there is no way to just sort one list in O(1) space without sorting the other; if you don't want to sort both lists, you might as well do your original approach which constructs a dummy list.

You can however do the following:

spam_list.sort(key=lambda x:x)

but if the key or cmp functions makes any references to any collection (e.g. if you pass in a dict.__getitem__ of a dict you had to construct) this is no better than your original O(N)-space approach, unless you already happened to have such a dictionary lying around.

Turns out this is a duplicate question of Python sort parallel arrays in place? , but that question also had no correct answers except this one , which is equivalent to mine but without the sample code. Unless you are incredibly optimized or specialized code, I'd just use your original solution, which is equivalent in space complexity to the other solutions.

edit2:
As senderle pointed out, the OP doesn't want a sort at all, but rather wishes to, I think, apply a permutation. To achieve this, you can and SHOULD use simply indexing that other answers suggest [spam_list[i] for i in spam_order], but an explicit or implicit copy must be made still because you still need the intermediate data. (Unrelated and for the record, applying the inverse permutation is I think the inverse of parallel sorting with the identity, and you can use one to get the other, though sorting is less time-efficient. _,spam_order_inverse = parallelSort(spam_order, range(N)), then sort by spam_order_inverse. I leave the above discussion about sorting up for the record.)

edit3:

It is possible, however, to achieve an in-place permutation in O(#cycles) space but with terrible time efficiency. Every permutation can be decomposed into disjoint permutations applied in parallel on subsets. These subsets are called cycles or orbits. The period is equal to their size. You thus take a leap of faith and do as follows:

Create a temp variable.

For index i=0...N:
    Put x_i into temp, assign NULL to x_i
    Swap temp with x_p(i)
    Swap temp with x_p(p(i))
    ...
    Swap temp with x_p(..p(i)..), which is x_i
    Put a "do not repeat" marker on the smallest element you visited larger than i
    Whenever you encounter a "do not repeat" marker, perform the loop again but
      without swapping, moving the marker to the smallest element larger than i    
    To avoid having to perform the loop again, use a bloom filter


This will run in O(N^2) time and O(#cycles) place without a bloom filter, or ~O(N) time and O(#cycle + bloomfilter_space) space if you use them
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  -上瘾入骨i        
                
              
                            
                2021-01-02 05:26
              
            
            
                                                                       
If you actually don't care about the efficiency at all, and just want in-place semantics (which is a bit odd, because there are entire programming languages dedicated to avoiding in-place semantics), then you can do this:

def modifyList(toModify, newList):
    toModify[:] = newList

def permuteAndUpdate(toPermute, permutation):
    modifyList(toPermute, [toPermute[i] for i in permutation])

permuteAndUpdate(spam_list, spam_order)

print(spam_list)
# ['We', 'are', 'the', 'Ni', 'knights', 'who', 'say']


Credit goes to senderle for recognizing that this is what the OP may actually be after; he should feel free to copy this answer into his own. Should not accept this answer unless you really prefer it over his.
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  挽巷        
                
              
                            
                2021-01-02 05:27
              
            
            
                                                                       
You can give a special key to the sort function:

order = dict(zip(spam_list, spam_order))
spam_list.sort(key=order.get)


Edit: As @ninjagecko points out in his answer, this is not really efficient, as it copies both lists to create the dictionary for the lookup. However, with the modified example given by the OP, this is the only way, because one has to build some index. The upside is that, at least for the strings, the values will not be copied, so the overhead is just that of the dictionary itself.
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  一整个雨季        
                
              
                            
                2021-01-02 05:27
              
            
            
                                                                       
You may use numpy.

import numpy as np
spam_list = list(np.array(spam_list)[spam_order])

                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  佛祖请我去吃肉        
                
              
                            
                2021-01-02 05:35
              
            
            
                                                                       
You could try:

spam_list = [spam_list[i] for i in spam_order]

                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
   
          
     1
2
下一页
           
           
        
                                  
        
        
          
            
            
              
              
            
    


                                 
              
            
                          
    

        
         
                验证码
                
                  
                
                
                   看不清?
                
              
                                  
                    
   
                 
             
              提交回复