
SIRIGIRI · ‎12-31-2022

Spark Dataframes Schema

Schema inference is not reliable.

Schema inference has the following problems:

1. The automatically inferred schema is often incorrect.
2. Inferring the schema is additional work for Spark and takes extra time.
3. Schema inference conflicts with schema validation.
4. It might also change the column order.

Instead, we can define the schema explicitly. There are two approaches:

Schema DDL String
Struct Type Object

For a more detailed description, please refer to this link:

https://sharikrishna26.medium.com/spark-dataframes-schema-6fe1f90a56c

Please like, share, and comment.

Happy New Year 2023!

Rishabh-Pandey · ‎12-31-2022

Thanks for sharing

Rishabh Pandey

Aviral-Bhardwaj · ‎01-01-2023

Good post, thanks.

AviralBhardwaj

Varshith · ‎01-01-2023

One other difference between those two approaches: in the Schema DDL String approach we use SQL type names such as STRING, INT, etc., but in the Struct Type Object approach we can only use Spark data type classes such as StringType(), IntegerType(), etc.