Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Load an explicit schema from an external metadata.csv or JSON file for reading CSVs into a DataFrame

AnandNair
New Contributor

Hi,

I have a metadata CSV file which contains column names and data types, such as:

Colm1: INT

Colm2: String

I can also get the same metadata in JSON format.

I can store this on ADLS. How can I convert it into a schema (e.g. "Myschema") that I can then pass to spark.read.format("csv") when reading the data files that this metadata describes? When I infer the schema for multiple incremental CSV files instead, I get clashes while writing into Delta, such as:

"Failed to merge fields 'Colm1' and 'Colm1'. Failed to merge incompatible data types IntegerType and StringType

Any pointers/notes would be appreciated.

Thanks!
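A minimal sketch of one possible approach in PySpark on Databricks, assuming the metadata CSV has a header row with hypothetical columns column_name and data_type, and using placeholder ADLS paths:

from pyspark.sql.types import (
    StructType, StructField, IntegerType, StringType,
    DoubleType, BooleanType, DateType, TimestampType,
)

# Hypothetical ADLS paths -- replace with the real locations.
metadata_path = "abfss://<container>@<account>.dfs.core.windows.net/metadata/metadata.csv"
data_path = "abfss://<container>@<account>.dfs.core.windows.net/incoming/*.csv"

# Map the type names used in the metadata file to Spark SQL types.
type_mapping = {
    "INT": IntegerType(),
    "STRING": StringType(),
    "DOUBLE": DoubleType(),
    "BOOLEAN": BooleanType(),
    "DATE": DateType(),
    "TIMESTAMP": TimestampType(),
}

# Read the small metadata CSV itself (assumed header: column_name,data_type).
# `spark` is the SparkSession already available in a Databricks notebook.
metadata_rows = (
    spark.read.format("csv")
    .option("header", "true")
    .load(metadata_path)
    .collect()
)

# Build an explicit StructType from the metadata rows.
my_schema = StructType([
    StructField(row["column_name"], type_mapping[row["data_type"].strip().upper()], True)
    for row in metadata_rows
])

# Apply the schema when reading the data files, so every incremental
# load comes in with identical column types.
df = (
    spark.read.format("csv")
    .option("header", "true")
    .schema(my_schema)
    .load(data_path)
)

If the JSON version of the metadata follows Spark's own schema JSON format (what StructType.json() produces), StructType.fromJson(json.loads(...)) can rebuild the schema directly; alternatively, a DDL string such as "Colm1 INT, Colm2 STRING" can be passed straight to .schema(). Either way, pinning the schema up front avoids the IntegerType/StringType merge conflict when writing the incremental loads into Delta.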

0 REPLIES
