There are a few Stack Overflow questions about computing one-hot embeddings with TensorFlow, and here is the accepted solution:
num_labels = 10
sparse_labels = tf.reshape(labels, [-1, 1])
# ... (the snippet continues and ends with a call to tf.sparse_to_dense())
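For context, here is a sketch of how that tutorial-style construction is usually written in full. The variable names, the placeholder, and the TF 0.x-era tf.concat()/tf.pack() argument order are my reconstruction, not necessarily the exact code from the accepted answer:

import tensorflow as tf

num_labels = 10
labels = tf.placeholder(tf.int32, shape=[None])  # integer class IDs, shape [batch_size]

# Build [batch_size, 2] coordinates (row index, class ID) marking where the 1.0 entries go.
sparse_labels = tf.reshape(labels, [-1, 1])
derived_size = tf.shape(labels)[0]
indices = tf.reshape(tf.range(0, derived_size, 1), [-1, 1])
concated = tf.concat(1, [indices, sparse_labels])    # TF 0.x order: concat(axis, values)
output_shape = tf.pack([derived_size, num_labels])   # tf.pack() became tf.stack() in later releases
one_hot_labels = tf.sparse_to_dense(concated, output_shape, 1.0, 0.0)

Despite the name, tf.sparse_to_dense() still materializes a dense [batch_size, num_labels] tensor, which is what the size comparison below refers to.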
The one_hot() function in your question looks correct. However, the reason we do not recommend writing code this way is that it is very memory-inefficient. To understand why, let's say you have a batch size of 32 and 1,000,000 classes.
In the version suggested in the tutorial, the largest tensor will be the result of tf.sparse_to_dense(), which will be 32 x 1,000,000 (roughly 128 MB at 4 bytes per element).
In the one_hot() function in the question, the largest tensor will be the result of np.identity(1000000), a 1,000,000 x 1,000,000 matrix, which is 4 terabytes. Of course, allocating this tensor probably won't succeed. Even if the number of classes were much smaller, it would still waste memory to store all of those zeroes explicitly: TensorFlow does not automatically convert your data to a sparse representation, even though it might be profitable to do so.
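For reference, a one_hot() built on np.identity() typically looks something like the following sketch; this illustrates the pattern, not the asker's exact function:

import numpy as np

def one_hot(labels, num_classes):
    """Return a [len(labels), num_classes] one-hot array for integer labels."""
    # Materializes the full num_classes x num_classes identity matrix up front,
    # then gathers one row per label. Simple, but the identity matrix alone
    # grows quadratically with the number of classes.
    return np.identity(num_classes)[labels]

For example, one_hot([2, 0, 1], 4) returns a 3 x 4 array with a single 1.0 per row.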
Finally, I want to offer a plug for a new function that was recently added to the open-source repository and will be available in the next release: tf.nn.sparse_softmax_cross_entropy_with_logits(). It allows you to specify a vector of integers as the labels and saves you from having to build the dense one-hot representation. It should be much more efficient than either solution for large numbers of classes.
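A sketch of how it is used (written with the keyword-argument form that later TensorFlow releases require; the placeholder shapes are just for illustration):

import tensorflow as tf

num_classes = 1000000
logits = tf.placeholder(tf.float32, shape=[None, num_classes])  # unnormalized scores per class
labels = tf.placeholder(tf.int64, shape=[None])                 # integer class IDs, no one-hot needed

# Per-example cross-entropy computed directly from the integer labels.
losses = tf.nn.sparse_softmax_cross_entropy_with_logits(labels=labels, logits=logits)
loss = tf.reduce_mean(losses)

The labels tensor stays a length-batch_size vector of ints, so the [batch_size, num_classes] one-hot tensor is never materialized.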