Is there a Python library that can be used to get just the schema of a Parquet file?
Currently we are loading the Parquet file into a DataFrame in Spark and getting the schema from it.
In addition to the answer by @mehdio: if your Parquet "file" is actually a directory (e.g. a Parquet dataset written by Spark), you can read the schema and column names like this:
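import pyarrow.parquet as pq

# read_table accepts a single file or a directory of part files.
# Note that it loads the data into memory, not only the schema.
pfile = pq.read_table("file.parquet")
print("Column names: {}".format(pfile.column_names))
print("Schema: {}".format(pfile.schema))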