Hubert-Dudek
Databricks MVP

As @Werner Stinckens said, just load your file the normal way (spark.read.parquet) without specifying a schema, and then extract the DDL from the schema Spark infers.

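# Read the file, let Spark infer the schema, and serialize that schema to JSON
schema_json = spark.read.parquet("your_file.parquet").schema.json()
# Rebuild the schema on the JVM side and render it as a DDL string (goes through the private _jvm handle)
ddl = spark.sparkContext._jvm.org.apache.spark.sql.types.DataType.fromJson(schema_json).toDDL()
print(ddl)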
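If you would rather not go through the private `_jvm` handle, the same schema JSON can be turned into a DDL-style string in plain Python. This is only a rough sketch, not Spark's own `toDDL()`: the helper names `type_to_ddl` and `schema_json_to_ddl` are made up for illustration, the type mapping covers only common types, nullability and column comments are ignored, and the exact quoting and spacing of Spark's output differs between versions.

```python
import json

# Spark JSON type name -> SQL type name (common primitive types only; an assumption-limited subset)
_PRIMITIVES = {
    "string": "STRING", "boolean": "BOOLEAN", "byte": "TINYINT",
    "short": "SMALLINT", "integer": "INT", "long": "BIGINT",
    "float": "FLOAT", "double": "DOUBLE", "date": "DATE",
    "timestamp": "TIMESTAMP", "binary": "BINARY",
}


def type_to_ddl(t):
    """Render one type node of the schema JSON as a SQL type string (hypothetical helper)."""
    if isinstance(t, str):
        if t in _PRIMITIVES:
            return _PRIMITIVES[t]
        if t.startswith("decimal("):
            return t.upper()  # e.g. "decimal(10,2)" -> "DECIMAL(10,2)"
        raise ValueError(f"unsupported type in this sketch: {t}")
    kind = t["type"]
    if kind == "array":
        return f"ARRAY<{type_to_ddl(t['elementType'])}>"
    if kind == "map":
        return f"MAP<{type_to_ddl(t['keyType'])}, {type_to_ddl(t['valueType'])}>"
    if kind == "struct":
        inner = ", ".join(f"{f['name']}: {type_to_ddl(f['type'])}" for f in t["fields"])
        return f"STRUCT<{inner}>"
    raise ValueError(f"unsupported type in this sketch: {kind}")


def schema_json_to_ddl(schema_json):
    """Turn the output of df.schema.json() into a comma-separated column list (hypothetical helper)."""
    schema = json.loads(schema_json)
    # Backtick-quote every column name so unusual names stay valid
    return ", ".join(f"`{f['name']}` {type_to_ddl(f['type'])}" for f in schema["fields"])
```

You would call it as `schema_json_to_ddl(spark.read.parquet("your_file.parquet").schema.json())`. Treat the result as a starting point and compare it against what `toDDL()` prints on your cluster before relying on it.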


My blog: https://databrickster.medium.com/
