Length of longest word in a list


面向向阳花

What is the more pythonic way of getting the length of the longest word:

len(max(words, key=len))

Or:

max(len(w) for w in words)


                      
6 Answers

        
                         				            
            
           
            
                              
                
              
              
                
                  借酒劲吻你        
                
              
                            
                2020-12-06 19:29
              
            
            
                                                                       
Just for information, here are timings from IPython's %timeit:

In [150]: words
Out[150]: ['now', 'is', 'the', 'winter', 'of', 'our', 'partyhat']

In [148]: %timeit max(len(w) for w in words)
100000 loops, best of 3: 1.87 us per loop

In [149]: %timeit len(max(words, key=len))
1000000 loops, best of 3: 1.35 us per loop


Updated with a much larger word list to demonstrate @Omnifarious's point:

In [160]: words = map(string.rstrip, open('/usr/share/dict/words').readlines())

In [161]: len(words)
Out[161]: 235886

In [162]: %timeit max(len(w) for w in words)
10 loops, best of 3: 44 ms per loop

In [163]: %timeit len(max(words, key=len))
10 loops, best of 3: 25 ms per loop
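
For anyone without IPython, roughly the same comparison can be sketched with the standard-library timeit module. This is only a sketch under assumptions: the word list is the small sample from above repeated 100 times (not the dictionary file), the helper names genexp_version and key_version are made up for illustration, and the absolute numbers will differ by machine and Python version.

```python
import timeit

# Small sample list from the session above, repeated so the work is measurable.
words = ['now', 'is', 'the', 'winter', 'of', 'our', 'partyhat'] * 100

def genexp_version(ws):
    # Generator expression: yields each length, then max picks the largest.
    return max(len(w) for w in ws)

def key_version(ws):
    # key=len: max finds the longest word, then len measures it once more.
    return len(max(ws, key=len))

# Both forms must agree before timing them is meaningful.
assert genexp_version(words) == key_version(words)

# Best of 3 repeats of 1000 calls each, reported per call in microseconds.
for fn in (genexp_version, key_version):
    best = min(timeit.repeat(lambda: fn(words), number=1000, repeat=3))
    print("%s: %.2f us per call" % (fn.__name__, best / 1000 * 1e6))
```

Taking the minimum of the repeats mirrors the "best of 3" figure that %timeit prints above.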

                                                                        
                                                        
            
            
              
                
                  猫巷女王i        
                
              
                            
                2020-12-06 19:34
              
            
            
                                                                       
Although:

max(len(w) for w in words)


does "read" more easily, it comes with the overhead of a generator.

While:

len(max(words, key=len))


can skip that overhead, because max calls the built-in len directly as its key. Since len is normally a very cheap operation on strings, this form is going to be faster.
                                                                        
                                                        
            
            
              
                
                  清歌不尽        
                
              
                            
                2020-12-06 19:36
              
            
            
                                                                       
I know it's been a year now, but nevertheless, I came up with this:

'''Write a function find_longest_word() that takes a list of words and
returns the length of the longest one.'''

a = ['mamao', 'abacate', 'pera', 'goiaba', 'uva', 'abacaxi', 'laranja', 'maca']

def find_longest_word(a):
    # Collect the length of every word.
    d = []
    for c in a:
        d.append(len(c))
    e = max(d)  # Try "min" :D
    # Report every word that has the maximum length.
    for b in a:
        if len(b) == e:
            print("Length is %i for %s" % (len(b), b))
    return e

                                                                        
                                                        
            
            
              
                
                  眼角桃花        
                
              
                            
                2020-12-06 19:40
              
            
            
                                                                       
I'd say 

len(max(x, key=len))


looks quite good because you use a keyword argument (key) of a built-in (max) with another built-in (len). max(x, key=len) already gets you the longest word, which is almost the answer. That said, neither of your code variants looks particularly un-Pythonic to me.
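
To spell out the "almost the answer" part, here is a small sketch. The list x is just the sample list used elsewhere in this thread, and the variable names are made up for illustration.

```python
x = ['now', 'is', 'the', 'winter', 'of', 'our', 'partyhat']

longest_word = max(x, key=len)   # the longest word itself, not its length
longest_len = len(longest_word)  # one more len call turns it into the answer

print(longest_word, longest_len)

# When several words share the maximum length, max returns the first one.
first_of_tie = max(['aa', 'bb', 'c'], key=len)
print(first_of_tie)
```

A side benefit of this form is that you still have the word in hand if you need it later.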
                                                                        
                                                        
            
            
              
                
                  孤城傲影        
                
              
                            
                2020-12-06 19:43
              
            
            
                                                                       
If you rewrite the generator expression as a map call (or, for 2.x, imap):

max(map(len, words))


… it's actually a bit faster than the key version, not slower.

python.org 64-bit 3.3.0:

In [186]: words = ['now', 'is', 'the', 'winter', 'of', 'our', 'partyhat'] * 100
In [188]: %timeit max(len(w) for w in words)
10000 loops, best of 3: 90.1 us per loop
In [189]: %timeit len(max(words, key=len))
10000 loops, best of 3: 57.3 us per loop
In [190]: %timeit max(map(len, words))
10000 loops, best of 3: 53.4 us per loop


Apple 64-bit 2.7.2:

In [298]: words = ['now', 'is', 'the', 'winter', 'of', 'our', 'partyhat'] * 100
In [299]: %timeit max(len(w) for w in words)
10000 loops, best of 3: 99 us per loop
In [300]: %timeit len(max(words, key=len))
10000 loops, best of 3: 64.1 us per loop
In [301]: %timeit max(map(len, words))
10000 loops, best of 3: 67 us per loop
In [303]: %timeit max(itertools.imap(len, words))
10000 loops, best of 3: 63.4 us per loop


I think it's more pythonic than the key version, for the same reason the genexp is.

It's arguable whether it's as pythonic as the genexp version. Some people love map/filter/reduce/etc.; some hate them; my personal feeling is that when you're trying to map a function that already exists and has a nice name (that is, something you don't have to lambda or partial up), map is nicer, but YMMV (especially if your name is Guido).
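
As a rough illustration of that preference (my own sketch, not a rule): the list raw and its trailing newlines are invented here to create a case where no ready-made function fits.

```python
words = ['now', 'is', 'the', 'winter', 'of', 'our', 'partyhat']

# An existing, well-named function maps cleanly.
longest = max(map(len, words))

# Hypothetical input with trailing newlines, where we want stripped lengths.
raw = ['now\n', 'winter\n', 'partyhat\n']

# With map, this needs a lambda...
longest_stripped_map = max(map(lambda w: len(w.strip()), raw))

# ...whereas the generator expression just states the expression inline.
longest_stripped_genexp = max(len(w.strip()) for w in raw)

print(longest, longest_stripped_map, longest_stripped_genexp)
```

All three give the same number here. The only difference is how the per-item step is written.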

One last point:


  the redundancy of len being called twice seems not to matter - does more happen in C code in this form?


Think about it like this: You're already calling len N times. Calling it N+1 times instead is hardly likely to make a difference, compared to anything you have to do N times, unless you have a tiny number of huge strings.
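
If you want to see the N versus N+1 for yourself, a counting wrapper makes it visible. This is only a sketch: counting_len and the counter dict are hypothetical helpers for illustration, and wrapping len in a Python function changes the timings, so use it for counting calls only.

```python
words = ['now', 'is', 'the', 'winter', 'of', 'our', 'partyhat']

counter = {'calls': 0}

def counting_len(s):
    # Behaves like len, but records how many times it was invoked.
    counter['calls'] += 1
    return len(s)

# Generator-expression form: one call per word.
counter['calls'] = 0
max(counting_len(w) for w in words)
genexp_calls = counter['calls']

# key form: one call per word inside max, plus one on the result.
counter['calls'] = 0
counting_len(max(words, key=counting_len))
key_calls = counter['calls']

print(genexp_calls, key_calls)
```

For the 7-word sample list, that is 7 calls against 8.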
                                                                        
                                                        
            
            
              
                
                  离开以前        
                
              
                            
                2020-12-06 19:44
              
            
            
                                                                       
I think both are OK, but unless speed is a big consideration, max(len(w) for w in words) is the more readable of the two.

When I was looking at them, it took me longer to figure out what len(max(words, key=len)) was doing, and I was still wrong until I thought about it more. Code should be immediately obvious unless there's a good reason for it not to be.

It's clear from the other posts (and my own tests) that the less readable one is faster. But neither of them is dog slow, and unless the code is on a critical path, it's not worth worrying about.

Ultimately, I think more readable is more Pythonic.

As an aside, this is one of the few cases in which Python 2 is notably faster than Python 3 for the same task.
                                                                        
                                                        
            
            
              
                