TensorFlow One Hot Encoder?


Does TensorFlow have something similar to scikit-learn's one-hot encoder for processing categorical data? Would a placeholder of type tf.string behave as categorical data?


15 answers

长情又很酷

2020-12-04 14:38

Maybe it's due to changes in TensorFlow since Nov 2015, but @dga's answer produced errors. I did get it to work with the following modifications:

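# label_batch: 1-D integer tensor of class ids; num_labels: number of classes.
sparse_labels = tf.reshape(label_batch, [-1, 1])
derived_size = tf.shape(sparse_labels)[0]  # batch size, known only at run time
indices = tf.reshape(tf.range(0, derived_size, 1), [-1, 1])  # row numbers 0..batch-1
# Pre-1.0 argument order: tf.concat(concat_dim, values).
concated = tf.concat(1, [indices, sparse_labels])  # (row, label) pairs
outshape = tf.concat(0, [tf.reshape(derived_size, [1]), tf.reshape(num_labels, [1])])
labels = tf.sparse_to_dense(concated, outshape, 1.0, 0.0)  # 1.0 at each pair, 0.0 elsewhere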


予麋鹿

2020-12-04 14:39

You can use tf.sparse_to_dense:

The sparse_indices argument indicates where the ones should go, output_shape should be set to the number of possible outputs (e.g. the number of labels), and sparse_values should be 1 in the desired type (the type of the output is taken from the type of sparse_values).
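To make the roles of those three arguments concrete, here is a minimal NumPy sketch of the same idea. It is not TensorFlow's implementation; `scatter_to_dense` is a hypothetical helper written only for illustration, and the 5-class example data is made up.

```python
import numpy as np

def scatter_to_dense(sparse_indices, output_shape, sparse_value=1.0, default_value=0.0):
    """Hypothetical stand-in: write sparse_value at each index, default_value elsewhere."""
    dense = np.full(output_shape, default_value, dtype=np.float32)
    for index in sparse_indices:
        dense[tuple(index)] = sparse_value
    return dense

# One label out of 5 classes: the single index says where the 1 goes.
single = scatter_to_dense([[3]], [5])

# A batch of labels: pair each row number with its label to get 2-D indices.
labels = [2, 0, 4]
pairs = [[row, label] for row, label in enumerate(labels)]
batch = scatter_to_dense(pairs, [len(labels), 5])

print(single)
print(batch)
```

For a batch, the output shape becomes [batch_size, num_labels] and every index needs both a row and a column, which is why the other answers in this thread build (row, label) pairs before calling tf.sparse_to_dense.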

时光说笑

2020-12-04 14:40

There's embedding_ops in Scikit Flow and examples that deal with categorical variables, etc.

If you are just starting to learn TensorFlow, I would suggest trying out the examples in TensorFlow/skflow first. Once you are more familiar with TensorFlow, it should be fairly easy to insert TensorFlow code to build the custom model you want (there are also examples for this).

I hope those examples for image and text understanding get you started. Let us know if you encounter any issues (post an issue or tag skflow on SO).


渐次进展

2020-12-04 14:41

In [7]: one_hot = tf.nn.embedding_lookup(np.eye(5), [1,2])

In [8]: one_hot.eval()
Out[8]: 
array([[ 0.,  1.,  0.,  0.,  0.],
       [ 0.,  0.,  1.,  0.,  0.]])

This works on TF version 1.3.0 (as of Sep 2017).
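The reason this works is that row i of an identity matrix is exactly the one-hot vector for class i, so looking up rows by id yields one-hot encodings. A small NumPy-only sketch of the same trick (the variable names are mine, chosen for illustration):

```python
import numpy as np

num_classes = 5
ids = np.array([1, 2])

# Row i of the identity matrix is the one-hot vector for class i,
# so indexing the rows by id gives one encoding per input id.
one_hot = np.eye(num_classes)[ids]
print(one_hot)
```

Note that this materializes a full num_classes x num_classes matrix, so it is probably only reasonable for a modest number of classes.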


我在风中等你

2020-12-04 14:42

My version of @CFB's and @dga's examples, shortened a bit to make them easier to follow.

num_labels = 10
labels_batch = [2, 3, 5, 9]

sparse_labels = tf.reshape(labels_batch, [-1, 1])
derived_size = len(labels_batch)
indices = tf.reshape(tf.range(0, derived_size, 1), [-1, 1])
concated = tf.concat(1, [indices, sparse_labels]) 
labels = tf.sparse_to_dense(concated, [derived_size, num_labels], 1.0, 0.0)


旧巷少年郎

2020-12-04 14:48

Take a look at tf.nn.embedding_lookup. It maps from categorical IDs to their embeddings.

For an example of how it's used for input data, see here.
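As a rough illustration of what "maps from categorical IDs to their embeddings" means, the NumPy sketch below treats the lookup as picking rows out of a table. The table here is random and its sizes are arbitrary assumptions; in a real model it would be a trainable variable. The sketch also shows that the lookup gives the same result as multiplying a one-hot matrix by the table, which is why an explicit one-hot step is often unnecessary when an embedding follows.

```python
import numpy as np

rng = np.random.default_rng(0)

num_categories, embedding_dim = 4, 3  # arbitrary sizes for illustration
embedding_table = rng.normal(size=(num_categories, embedding_dim))

category_ids = np.array([2, 0, 2])

# Lookup: each id selects the matching row of the table.
looked_up = embedding_table[category_ids]

# Equivalent (but more expensive) route: one-hot encode, then matrix-multiply.
one_hot = np.eye(num_categories)[category_ids]
via_matmul = one_hot @ embedding_table

print(looked_up.shape)
print(np.allclose(looked_up, via_matmul))
```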
