Creating many feature columns in Tensorflow

轻奢々 2021-01-04 03:42

I'm getting started on a Tensorflow project, and am in the middle of defining and creating my feature columns. However, I have hundreds and hundreds of features - it's a pr

3 Answers
  • 2021-01-04 04:26

    I used the code from your own answer with one small fix: the for loop should append to my_columns, not my_column. Here is the version that worked for me.

    import pandas.api.types as ptypes
    import tensorflow as tf

    my_columns = []

    for col in df.columns:
      if ptypes.is_string_dtype(df[col]):  # is_string_dtype is a pandas function
        # String column: hash into one bucket per distinct value seen in df
        my_columns.append(tf.feature_column.categorical_column_with_hash_bucket(
            col, hash_bucket_size=len(df[col].unique())))

      elif ptypes.is_numeric_dtype(df[col]):  # is_numeric_dtype is a pandas function
        my_columns.append(tf.feature_column.numeric_column(col))
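    A note on the choice of hash_bucket_size above: setting it to the number of distinct values does not give every value its own bucket, because hashing can map two values to the same bucket. The sketch below estimates how many buckets end up in use. It assumes an idealized, uniformly random hash (TensorFlow's real hash function is not modeled), and expected_used_buckets is a hypothetical helper name, not part of any library.

```python
def expected_used_buckets(num_values: int, num_buckets: int) -> float:
    """Expected number of non-empty buckets when num_values distinct values
    are hashed uniformly at random into num_buckets buckets (idealized model)."""
    # A given bucket stays empty with probability (1 - 1/num_buckets) ** num_values.
    prob_empty = (1.0 - 1.0 / num_buckets) ** num_values
    return num_buckets * (1.0 - prob_empty)


if __name__ == "__main__":
    distinct = 1000
    # Bucket count equal to the distinct count, as in the loop above,
    # and two larger settings for comparison.
    for buckets in (distinct, 3 * distinct, 10 * distinct):
        used = expected_used_buckets(distinct, buckets)
        print(f"{buckets:>6} buckets -> about {used:.0f} in use for {distinct} values")
```

    Under that assumption, roughly 63% of the buckets are occupied when the bucket count equals the distinct count, so a noticeable share of values collide. A larger hash_bucket_size reduces collisions at the cost of a bigger feature space.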
    
  • 2021-01-04 04:37

    What you posted in the question makes sense. Here is a small extension of your own code that also handles pandas categorical columns:

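    import pandas.api.types as ptypes
    import tensorflow as tf

    my_columns = []
    for col in df.columns:
      if ptypes.is_string_dtype(df[col]):
        my_columns.append(tf.feature_column.categorical_column_with_hash_bucket(
            col, hash_bucket_size=len(df[col].unique())))

      elif ptypes.is_numeric_dtype(df[col]):
        my_columns.append(tf.feature_column.numeric_column(col))

      # Note: recent pandas releases deprecate is_categorical_dtype
      elif ptypes.is_categorical_dtype(df[col]):
        # tf.feature_column has no plain categorical_column(); as an assumption,
        # a vocabulary-list column built from the pandas categories is used here
        my_columns.append(tf.feature_column.categorical_column_with_vocabulary_list(
            col, vocabulary_list=list(df[col].cat.categories)))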
    
  • 2021-01-04 04:41

    The two methods above work only if the data is in a pandas DataFrame, where every column has a name. If all of your columns are numeric and you don't want to name each one, for example when reading several numerical columns from a NumPy array, you can use something like this:

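    feature_column = [tf.feature_column.numeric_column(key='image', shape=(784,))]

    # TF 1.x API. shuffle has to be set explicitly; for training, labels are passed via y=...
    input_fn = tf.estimator.inputs.numpy_input_fn(x={'image': x_train}, shuffle=True)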
    

    where x_train is your NumPy array with 784 columns. You can check this post by Vikas Sangwan for more details.
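    For the shapes to line up, the array passed under the 'image' key needs one row per example and 784 values per row. A minimal sketch of preparing such an array follows. It assumes, purely for illustration, that the raw data are 28x28 images (784 = 28 * 28); the zero-filled images array is a hypothetical stand-in for real data.

```python
import numpy as np

# Hypothetical stand-in for real data: 10 grayscale images of 28x28 pixels.
images = np.zeros((10, 28, 28), dtype=np.float32)

# Flatten each image into a single row of 784 values so the array matches
# a numeric column declared with shape=(784,).
x_train = images.reshape(len(images), -1)

# The dictionary key must be the same string as the feature column's key.
features = {'image': x_train}
print(features['image'].shape)
```

    The single 'image' key stands for all 784 numeric inputs at once, which is what removes the need to name each column.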
