I'm getting an error when trying to cast a StringType to an IntegerType on a PySpark DataFrame:
joint = aggregates.join(df_data_3,aggregates.year==df_data_3.year)
PySpark SQL data types are no longer singletons (as they were before Spark 1.3), so passing the class itself does not work. You have to create an instance:
from pyspark.sql.types import IntegerType
from pyspark.sql.functions import col
col("foo").cast(IntegerType())
Column
In contrast to:
col("foo").cast(IntegerType)
TypeError
...
TypeError: unexpected type:
The cast method can also be used with string descriptions:
col("foo").cast("integer")
Column
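To see why the class is rejected while an instance or a string is accepted, here is a minimal pure-Python sketch of that kind of argument check. It is not PySpark's source: ToyDataType, ToyIntegerType and toy_cast are hypothetical stand-ins, and the returned string only imitates a cast expression. The point is that a class object is not an instance of its own base class, so an isinstance check fails for it.

```python
class ToyDataType:
    """Hypothetical stand-in for a SQL data type base class."""

    def simple_string(self):
        raise NotImplementedError


class ToyIntegerType(ToyDataType):
    """Hypothetical stand-in for an integer data type."""

    def simple_string(self):
        return "int"


def toy_cast(column_name, data_type):
    """Accept a type-name string or a ToyDataType instance; reject anything else."""
    if isinstance(data_type, str):
        target = data_type
    elif isinstance(data_type, ToyDataType):
        target = data_type.simple_string()
    else:
        # A class object such as ToyIntegerType (without parentheses) lands here,
        # because the class itself is an instance of `type`, not of ToyDataType.
        raise TypeError("unexpected type: %s" % type(data_type))
    return "CAST(%s AS %s)" % (column_name, target.upper())


# An instance and a string are both accepted.
print(toy_cast("foo", ToyIntegerType()))
print(toy_cast("foo", "integer"))

# The bare class is rejected.
try:
    toy_cast("foo", ToyIntegerType)
except TypeError as exc:
    print("TypeError:", exc)
```

So whenever you see this error, check whether the parentheses after the type name are missing.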