How to get the schema definition from a dataframe in PySpark?

Asked by 萌比男神i on 2021-02-12 14:25

In PySpark you can define a schema and read data sources with this pre-defined schema, e.g.:

from pyspark.sql.types import StructType, StructField, StringType

Schema = StructType([StructField("name", StringType(), True)])
df = spark.read.csv("/path/to/file.csv", schema=Schema)


        
4 Answers
  •  梦谈多话 · 2021-02-12 14:32

    The code below gives you a well-formatted, tabular schema definition of a known dataframe. This is quite useful when the dataframe has a very large number of columns and writing the schema out by hand would be cumbersome. You can then apply it to your new dataframe and hand-edit any columns you want to change.

    from pyspark.sql.types import StructType

    # Iterating a StructType yields its StructField objects,
    # so this extracts the schema as an editable list of fields.
    schema = [i for i in df.schema]
    

    And then from here, you have your new schema:

    NewSchema = StructType(schema)
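
    To show the whole round trip without spinning up a SparkSession, here is a minimal sketch. The starting schema is built by hand as a stand-in for `df.schema`, and the field names (`id`, `name`) are made up for illustration:

    ```python
    from pyspark.sql.types import StructType, StructField, StringType, IntegerType

    # Stand-in for df.schema -- in practice you would take this
    # from an existing dataframe.
    schema = StructType([
        StructField("id", IntegerType(), True),
        StructField("name", StringType(), True),
    ])

    # Copy the StructFields out of the existing schema...
    fields = [i for i in schema]

    # ...hand-edit any field you like (here: make "id" non-nullable)...
    fields[0] = StructField("id", IntegerType(), False)

    # ...and build a new schema from the edited list.
    new_schema = StructType(fields)
    print(new_schema.simpleString())  # struct<id:int,name:string>
    ```

    The new schema can then be passed straight to a reader, e.g. `spark.read.csv(path, schema=new_schema)`.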
    
