I am new to Spark and Python and am having difficulty building a schema from a metadata file that can be applied to my data file. Scenario: Metadata File for the Data file (csv
The df.schema attribute of a PySpark DataFrame returns its schema as a StructType.
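For the question's scenario of building a schema from metadata, here is a minimal sketch of one possible approach, not necessarily what the asker ended up using. It turns the name/type pairs from the metadata table in this answer into a DDL schema string, which the PySpark reader accepts in place of a StructType. The `DDL_TYPES` mapping, the `metadata_to_ddl` helper and the `data.csv` path are hypothetical names introduced for illustration.

```python
# Hypothetical mapping from the type names used in the metadata table
# to Spark SQL DDL type names; extend it for other types as needed.
DDL_TYPES = {
    "IntegerType()": "INT",
    "TimestampType()": "TIMESTAMP",
    "StringType()": "STRING",
}


def metadata_to_ddl(rows):
    """Build a DDL schema string from (column name, type name) pairs."""
    parts = []
    for name, type_name in rows:
        key = type_name.strip()
        if key not in DDL_TYPES:
            raise ValueError(
                f"unsupported type {type_name!r} for column {name!r}"
            )
        parts.append(f"{name.strip()} {DDL_TYPES[key]}")
    return ", ".join(parts)


# The three rows of the metadata table, written out by hand here.
metadata = [
    ("id", "IntegerType()"),
    ("created_at", "TimestampType()"),
    ("updated_at", "StringType()"),
]

ddl = metadata_to_ddl(metadata)
print(ddl)  # id INT, created_at TIMESTAMP, updated_at STRING

# Assuming an active SparkSession named `spark` and a CSV file with a header
# row, the string can then be passed as the schema:
# data_df = spark.read.csv("data.csv", schema=ddl, header=True)
```

If the metadata itself lives in a DataFrame, the pairs could be pulled out with something like `[(r["name"], r["type"]) for r in df.collect()]` before calling the helper.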
Given your df:
+--------------------+---------------+
|                name|           type|
+--------------------+---------------+
|                  id|  IntegerType()|
|          created_at|TimestampType()|
|          updated_at|   StringType()|
+--------------------+---------------+
Type:
df.schema
Result:
StructType(
    List(
        StructField(id,IntegerType,true),
        StructField(created_at,TimestampType,true),
        StructField(updated_at,StringType,true)
    )