PySpark: How to add ten days to an existing date column

南方客 2021-01-18 01:21

I have a DataFrame in PySpark with a date column called "report_date".

I want to create a new column called "report_date_10" that is 10 days added to the origi

1 answer
  •  梦毁少年i
    2021-01-18 02:14

    It seems you are using the pandas syntax for adding a column. In Spark, you need to use withColumn to add a new column, and for adding days to a date there is the built-in date_add function:

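    import pyspark.sql.functions as F
    from pyspark.sql import SparkSession
    
    spark = SparkSession.builder.getOrCreate()  # already defined in the pyspark shell
    df_dc = spark.createDataFrame([['2018-05-30']], ['report_date'])
    
    # date_add(column, n) returns a date column shifted forward by n days
    df_dc.withColumn('report_date_10', F.date_add(df_dc['report_date'], 10)).show()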
    +-----------+--------------+
    |report_date|report_date_10|
    +-----------+--------------+
    | 2018-05-30|    2018-06-09|
    +-----------+--------------+
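
As a quick sanity check on the output above, the same calendar arithmetic can be reproduced in plain Python with the standard library. This is only a sketch for verifying the expected value, not a replacement for date_add on a Spark column; the helper name add_days is hypothetical and not part of PySpark.

```python
from datetime import date, timedelta


def add_days(iso_date: str, days: int) -> str:
    """Hypothetical helper: add `days` calendar days to a 'YYYY-MM-DD' string."""
    shifted = date.fromisoformat(iso_date) + timedelta(days=days)
    return shifted.isoformat()


# Same input as the Spark example: 10 days after 2018-05-30
print(add_days("2018-05-30", 10))  # 2018-06-09
```

On a real DataFrame, the built-in date_add is generally preferable to wrapping logic like this in a Python UDF, since it runs inside Spark without per-row Python calls.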
    
