Select columns in PySpark DataFrame

小鲜肉 2021-02-03 21:45

I am looking for a way to select columns of my DataFrame in PySpark. For the first row, I know I can use df.first(), but I am not sure about columns given that they do

6 Answers
  •  走了就别回头了
    2021-02-03 22:37

    Use df.schema.names:

    spark.version
    # u'2.2.0'
    
    df = spark.createDataFrame([("foo", 1), ("bar", 2)])
    df.show()
    # +---+---+ 
    # | _1| _2|
    # +---+---+
    # |foo|  1| 
    # |bar|  2|
    # +---+---+
    
    df.schema.names
    # ['_1', '_2']
    
    for i in df.schema.names:
      # df_new = df.withColumn(i, [do-something])
      print(i)  # function-call form works on both Python 2 and Python 3
    # _1
    # _2
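
Once the names are known, picking columns comes down to passing them to df.select. Below is a minimal sketch of that idea, not part of the original answer: the helper name select_by_position is hypothetical, and it assumes df is a PySpark DataFrame (it only relies on df.columns and df.select).

```python
def select_by_position(df, positions):
    """Return a new DataFrame with only the columns at the given positions.

    Hypothetical helper for illustration; `df` is assumed to be a
    PySpark DataFrame.
    """
    # df.columns is the list of column names, in order
    names = [df.columns[i] for i in positions]
    # DataFrame.select accepts column names as separate string arguments
    return df.select(*names)
```

With the df built above, select_by_position(df, [1]) should give a DataFrame holding only the `_2` column. If the name is already known, df.select("_2") does the same thing directly.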
    
