TensorFlow One-Hot Encoder?


Does TensorFlow have something similar to scikit-learn's one-hot encoder for processing categorical data? Would using a placeholder of tf.string behave as categorical data?

15 answers

南笙

2020-12-04 14:48

Current versions of TensorFlow provide the following function for creating one-hot tensors:

https://www.tensorflow.org/versions/master/api_docs/python/array_ops.html#one_hot
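
For readers who want to see what that op computes, here is a minimal pure-Python sketch of the same semantics for a 1-D list of labels. The helper name `one_hot` and its defaults are my own illustration, not TensorFlow code; in TensorFlow itself the call would be along the lines of `tf.one_hot(labels, depth=num_labels)`. The sketch assumes, as the TensorFlow docs describe, that an index outside `[0, depth)` produces an all-zero row.

```python
def one_hot(indices, depth, on_value=1.0, off_value=0.0):
    """Illustrative pure-Python reference for a 1-D one-hot encoding."""
    rows = []
    for index in indices:
        # Start from a row filled with the "off" value
        row = [off_value] * depth
        # Only in-range indices get an "on" entry; others stay all-off
        if 0 <= index < depth:
            row[index] = on_value
        rows.append(row)
    return rows


# Four labels over three classes; -1 is deliberately out of range
encoded = one_hot([0, 2, -1, 1], depth=3)
for row in encoded:
    print(row)
```

The result has one row per label and one column per class, which is the same layout scikit-learn's encoder gives for a single integer-coded column.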

别那么骄傲

2020-12-04 14:49
As of TensorFlow 0.8, there is a native one-hot op, tf.one_hot, that converts a set of sparse labels to a dense one-hot representation. This is in addition to tf.nn.sparse_softmax_cross_entropy_with_logits, which in some cases lets you compute the cross entropy directly on the sparse labels instead of converting them to one-hot.
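
To see why the sparse variant can stand in for the one-hot conversion, here is a small pure-Python sketch (not TensorFlow code; all function names are made up for illustration). With a one-hot target, every term of the cross-entropy sum except the true class is multiplied by zero, so the loss reduces to the negative log-probability of the labeled class.

```python
import math


def softmax(logits):
    # Subtract the max for numerical stability before exponentiating
    shift = max(logits)
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy_one_hot(logits, one_hot_label):
    # Full sum over classes, weighted by the one-hot target
    probs = softmax(logits)
    return -sum(t * math.log(p) for t, p in zip(one_hot_label, probs))


def cross_entropy_sparse(logits, label):
    # Only the labeled class contributes, so index it directly
    return -math.log(softmax(logits)[label])


logits = [2.0, 0.5, -1.0]
dense_loss = cross_entropy_one_hot(logits, [1.0, 0.0, 0.0])
sparse_loss = cross_entropy_sparse(logits, 0)
print(dense_loss, sparse_loss)
```

Both values agree up to floating-point rounding, so when the loss is the only consumer of the labels, the one-hot step can be skipped.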

Previous answer, in case you want to do it the old way: @Salvador's answer is correct - there (used to be) no native op to do it. Instead of doing it in numpy, though, you can do it natively in tensorflow using the sparse-to-dense operators:
```
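import tensorflow as tf  # written against the pre-1.0 API (TensorFlow 0.x)

num_labels = 10

# label_batch is a 1-D tensor of numeric labels to process,
# with 0 <= label < num_labels

# Column vector of labels, shape [batch_size, 1]
sparse_labels = tf.reshape(label_batch, [-1, 1])
# Batch size read at run time, so a dynamic batch dimension works
derived_size = tf.shape(label_batch)[0]
# Column vector of row numbers 0..batch_size-1, shape [batch_size, 1]
indices = tf.reshape(tf.range(0, derived_size, 1), [-1, 1])
# (row, label) pairs; from TF 1.0 on this is tf.concat([indices, sparse_labels], 1)
concated = tf.concat(1, [indices, sparse_labels])
# Output shape [batch_size, num_labels]; tf.pack was renamed tf.stack in TF 1.0
outshape = tf.pack([derived_size, num_labels])
# Write 1.0 at each (row, label) position and 0.0 everywhere else
labels = tf.sparse_to_dense(concated, outshape, 1.0, 0.0)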
```
The output, labels, is a one-hot matrix of batch_size x num_labels.
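
If the index juggling in the snippet above is hard to follow, this plain-Python sketch walks through the same steps on ordinary lists. It is only an illustration of the idea, not TensorFlow code, and the function name `one_hot_via_index_pairs` is made up here.

```python
def one_hot_via_index_pairs(label_batch, num_labels):
    batch_size = len(label_batch)
    # (row, label) pairs: the role `concated` plays in the snippet above
    pairs = [(row, label) for row, label in enumerate(label_batch)]
    # All-zero matrix with the shape that `outshape` describes
    dense = [[0.0] * num_labels for _ in range(batch_size)]
    # Scatter 1.0 into each (row, label) position, like sparse_to_dense
    for row, label in pairs:
        dense[row][label] = 1.0
    return dense


labels = one_hot_via_index_pairs([3, 0, 1], num_labels=4)
for row in labels:
    print(row)
```

Each row ends up with exactly one 1.0, in the column given by that row's label.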

Note also that as of 2016-02-12 (which I assume will eventually be part of a 0.7 release), TensorFlow also has the tf.nn.sparse_softmax_cross_entropy_with_logits op, which in some cases can let you do training without needing to convert to a one-hot encoding.

Edited to add: At the end, you may need to explicitly set the shape of labels. The shape inference doesn't recognize the size of the num_labels component. If you don't need a dynamic batch size with derived_size, this can be simplified.

Edited 2016-02-12 to change the assignment of outshape per comment below.

后悔当初

2020-12-04 14:51

Recent versions of TensorFlow (nightlies and maybe even 0.7.1) have an op called tf.one_hot that does what you want. Check it out!

On the other hand, if you have a dense matrix and you want to look up and aggregate values in it, you would want to use the embedding_lookup function.
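
As a rough illustration of how the two relate (plain Python, hypothetical names, lookup only with no aggregation step): picking rows out of a matrix by id gives the same result as multiplying a one-hot matrix by it, without ever building the one-hot rows.

```python
def embedding_lookup(table, ids):
    # Direct row selection: one table row per id
    return [table[i] for i in ids]


def one_hot_matmul(table, ids):
    # Same result the long way: build a one-hot row, then multiply by the table
    depth = len(table)
    width = len(table[0])
    result = []
    for i in ids:
        one_hot_row = [1.0 if j == i else 0.0 for j in range(depth)]
        result.append(
            [sum(one_hot_row[j] * table[j][k] for j in range(depth))
             for k in range(width)]
        )
    return result


table = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]  # 3 ids, 2 values per id
ids = [2, 0]
looked_up = embedding_lookup(table, ids)
multiplied = one_hot_matmul(table, ids)
print(looked_up)
print(multiplied)
```

The direct lookup does far less work per id, which is the reason to prefer it when the one-hot vectors would only be used to select rows.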
