Question
I want the index values in my DataFrame to start from a specific value instead of the default of zero. Is there a parameter for zipWithIndex() in PySpark that does this?
Answer 1:
zipWithIndex() takes no arguments, so it always starts at 0. To start from a different value, add an offset to each index after zipping:
df = df_child.rdd.zipWithIndex().map(lambda x: (x[0], x[1] + index)).toDF()
where index is the value you want the numbering to start from.
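Note that the one-liner above produces a two-column DataFrame: `_1` holds the original row as a struct and `_2` holds the shifted index. If you would rather keep the original columns and append the index as a regular column, something like the following minimal sketch may help. The names `flatten_with_offset`, `add_start_index`, `demo`, and the default column name `index` are made up for illustration and are not part of PySpark.

```python
def flatten_with_offset(pair, start):
    """Turn one (row, idx) pair from zipWithIndex() into a flat tuple.

    A Row behaves like a tuple, so tuple(row) yields its column values;
    the shifted index is appended as the last element.
    """
    row, idx = pair
    return tuple(row) + (idx + start,)


def add_start_index(df, start, col_name="index"):
    """Return df with an extra column that counts up from `start`.

    `col_name` is a hypothetical default; pick any name that does not
    clash with the existing columns.
    """
    new_cols = df.columns + [col_name]
    return (
        df.rdd.zipWithIndex()
        .map(lambda pair: flatten_with_offset(pair, start))
        .toDF(new_cols)
    )


def demo():
    # Imported here so the helpers above can be loaded without Spark.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder.master("local[1]")
        .appName("zip-index-demo")
        .getOrCreate()
    )
    df_child = spark.createDataFrame([("a",), ("b",), ("c",)], ["letter"])
    # The appended column should start counting at 100 instead of 0.
    add_start_index(df_child, start=100).show()
    spark.stop()
```

Calling `demo()` on a machine with a local Spark installation should show the `letter` column next to an `index` column that begins at 100. Because the conversion goes through the RDD API, column types are inferred again by `toDF`; pass an explicit schema instead of a list of names if you need exact types.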
Source: https://stackoverflow.com/questions/60124599/start-index-with-certain-value-zipwithindex-in-pyspark