In PySpark, I have a dataframe with a bit more than 4 million rows.
I add a column to the dataframe with the withColumn function. The value of the column for each