There are a few Stack Overflow questions about computing one-hot embeddings with TensorFlow, and here is the accepted solution:
num_labels = 10
sparse_labels = tf.reshape(labels, [-1, 1])
# ... (the snippet continues and ends with a call to tf.sparse_to_dense())
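For context, here is a sketch of how that tutorial-style construction is usually written in full. The variable names, the placeholder, and the TF 0.x-era tf.concat()/tf.pack() argument order are my reconstruction, not necessarily the exact code from the accepted answer:

import tensorflow as tf

num_labels = 10
labels = tf.placeholder(tf.int32, shape=[None])  # integer class IDs, shape [batch_size]

# Build [batch_size, 2] coordinates (row index, class ID) marking where the 1.0 entries go.
sparse_labels = tf.reshape(labels, [-1, 1])
derived_size = tf.shape(labels)[0]
indices = tf.reshape(tf.range(0, derived_size, 1), [-1, 1])
concated = tf.concat(1, [indices, sparse_labels])    # TF 0.x order: concat(axis, values)
output_shape = tf.pack([derived_size, num_labels])   # tf.pack() became tf.stack() in later releases
one_hot_labels = tf.sparse_to_dense(concated, output_shape, 1.0, 0.0)

Despite the name, tf.sparse_to_dense() still materializes a dense [batch_size, num_labels] tensor, which is what the size comparison below refers to.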
The one_hot() function in your question looks correct. However, the reason we do not recommend writing code this way is that it is very memory-inefficient. To understand why, let's say you have a batch size of 32 and 1,000,000 classes.
In the version suggested in the tutorial, the largest tensor will be the result of tf.sparse_to_dense(), which will be 32 x 1,000,000 (roughly 128 MB at 4 bytes per element).
In the one_hot() function in the question, the largest tensor will be the result of np.identity(1000000), a 1,000,000 x 1,000,000 matrix, which is 4 terabytes. Of course, allocating this tensor probably won't succeed. Even if the number of classes were much smaller, it would still waste memory to store all of those zeroes explicitly: TensorFlow does not automatically convert your data to a sparse representation, even though it might be profitable to do so.
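For reference, a one_hot() built on np.identity() typically looks something like the following sketch; this illustrates the pattern, not the asker's exact function:

import numpy as np

def one_hot(labels, num_classes):
    """Return a [len(labels), num_classes] one-hot array for integer labels."""
    # Materializes the full num_classes x num_classes identity matrix up front,
    # then gathers one row per label. Simple, but the identity matrix alone
    # grows quadratically with the number of classes.
    return np.identity(num_classes)[labels]

For example, one_hot([2, 0, 1], 4) returns a 3 x 4 array with a single 1.0 per row.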
Finally, I want to offer a plug for a new function that was recently added to the open-source repository and will be available in the next release: tf.nn.sparse_softmax_cross_entropy_with_logits(). It allows you to specify a vector of integers as the labels and saves you from having to build the dense one-hot representation. It should be much more efficient than either solution for large numbers of classes.
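A sketch of how it is used (written with the keyword-argument form that later TensorFlow releases require; the placeholder shapes are just for illustration):

import tensorflow as tf

num_classes = 1000000
logits = tf.placeholder(tf.float32, shape=[None, num_classes])  # unnormalized scores per class
labels = tf.placeholder(tf.int64, shape=[None])                 # integer class IDs, no one-hot needed

# Per-example cross-entropy computed directly from the integer labels.
losses = tf.nn.sparse_softmax_cross_entropy_with_logits(labels=labels, logits=logits)
loss = tf.reduce_mean(losses)

The labels tensor stays a length-batch_size vector of ints, so the [batch_size, num_classes] one-hot tensor is never materialized.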