Error: only size-1 arrays can be converted to Python scalars


I have this code:

for a in data_X:
    for i in a:
        if not i.isdigit():
            x=hash(i)
            data_X[column,row]=x
        row=row+1
    r


                      
1 answer

一整个雨季, 2021-01-17 07:15

You're trying to use a list comprehension to create a new list, like this:

desired_array = [int(numeric_string) for numeric_string in data_X]


Since data_X is a 2D array, each numeric_string is a 1D array with one element per column (at least 7 in your case). (The fact that you called it numeric_string doesn't make it a string.) You can't call int on that, for exactly the reason that the error message shows.

If this isn't clear, you should try printing out the values:

for numeric_string in data_X:
    print(numeric_string)


… and it should be pretty clear that numeric_string is not a numeric string.
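To make the failure concrete, here is a minimal sketch. The sample data_X below is an assumption (a small 2D array of numeric strings), not the data from the question, and the exact wording of the error message may differ between NumPy versions.

```python
import numpy as np

# Assumed sample data: a small 2D array of numeric strings (not from the question).
data_X = np.array([["1", "2", "3"], ["4", "5", "6"]])

row = data_X[0]        # iterating over a 2D array yields 1D rows like this one
row_ndim = row.ndim    # number of dimensions of that row

try:
    int(row)           # int() needs a single value, not a whole row
    failed = False
except TypeError as exc:
    failed = True
    message = str(exc)  # the "can be converted to Python scalars" message

single = int(data_X[0, 0])  # a single element converts without trouble
```

Indexing down to one element, as in the last line, is what the nested loop below does for every cell.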



You could fix this with a nested loop. If you don't understand comprehensions that well, write it with explicit loop statements first:

desired_array = []
for row in data_X:
    desired_row = []
    for col in row:
        desired_row.append(int(col))
    desired_array.append(desired_row)


… and then you can turn it into a comprehension once you're sure you understand it:

desired_array = [[int(numeric_string) for numeric_string in row] for row in data_X]




However, that still doesn't give you a 2D array of ints; it gives you a list of lists of ints. It's similar, but it's bigger and slower, and you can't call numpy methods on it. (Although you can still pass it to global numpy functions, at least.)
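A small sketch of that difference, using a made-up nested list (the values are illustrative only):

```python
import numpy as np

nested = [[1, 2, 3], [4, 5, 6]]          # a list of lists of ints

has_sum_method = hasattr(nested, "sum")  # plain lists have no .sum() method
total = np.sum(nested)                   # the global function converts the list internally

as_array = np.array(nested)              # an actual 2D array
column_totals = as_array.sum(axis=0)     # array methods are available here
```

Converting once with np.array and then working on the array is usually preferable to passing the nested list to numpy functions repeatedly, since each such call has to convert it again.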

If you wanted to create a 2D array by looping, you could do that.
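One way that looping version might look is sketched here. It assumes data_X holds numeric strings (the sample values are invented) and preallocates the result with np.empty before filling it cell by cell.

```python
import numpy as np

# Assumed sample data: numeric strings in a 2D array (not from the question).
data_X = np.array([["1", "2", "3"], ["4", "5", "6"]])

# Preallocate an integer array of the same shape, then fill it element by element.
desired_array = np.empty(data_X.shape, dtype=np.int64)
for r in range(data_X.shape[0]):
    for c in range(data_X.shape[1]):
        desired_array[r, c] = int(data_X[r, c])
```

This works, but it runs the conversion in Python one element at a time.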

But as always with numpy, what you want to do, if at all possible, is use vectorized operations instead of loops. It'll be both a lot simpler and a lot faster, with no real downside.

What you probably want is astype:

desired_array = data_X.astype(np.int64)


It's hard to get any simpler than that. And, unless you wanted an array of dtype=object holding Python int values (e.g., because some of your numbers are too big to fit in a native int64), it's exactly what you want.
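For a quick check of what astype gives you, here is a short sketch. Again the sample data_X is an assumption: a 2D array whose entries are all valid integer strings.

```python
import numpy as np

# Assumed sample data: every entry is a valid integer string.
data_X = np.array([["1", "2", "3"], ["4", "5", "6"]])

desired_array = data_X.astype(np.int64)   # converts every element in one call

result_dtype = desired_array.dtype        # int64
row_totals = desired_array.sum(axis=1)    # numpy methods now work on the result
```

The result keeps the shape of data_X, so nothing else in the surrounding code should need to change.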
                                                                        
                                                        
            
            
              
                