sharikrishna26.medium.com

SIRIGIRI
Databricks Partner

Spark Dataframes Schema

Schema inference is not reliable.

Schema inference has the following problems:

  1. The automatically inferred schema is often incorrect
  2. Inferring the schema is additional work for Spark and takes extra time
  3. Schema inference conflicts with schema validation
  4. It might also change the column order

Instead, we can define the schema explicitly. We have two approaches to do it.

  1. Schema DDL String
  2. Struct Type Object

For a more detailed description, please refer to this link:

https://sharikrishna26.medium.com/spark-dataframes-schema-6fe1f90a56c

Please like, share, and comment.

Happy New Year 2023