Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Load an explicit schema from an external metadata.csv or JSON file for reading CSVs into a DataFrame

AnandNair
New Contributor

Hi,

I have a metadata CSV file which contains column names and data types, such as:

Colm1: INT

Colm2: String

I can also get the same metadata in JSON format.

I can store this on ADLS. How can I convert it into a schema (e.g. "Myschema") that I can then pass to spark.read.format("csv") when reading the data files that this metadata describes? When I infer the schema for multiple incremental CSV files instead, I get clashes while writing into Delta, such as:

"Failed to merge fields 'Colm1' and 'Colm1'. Failed to merge incompatible data types IntegerType and StringType

Any pointers/notes would be appreciated.

Thanks!
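A minimal sketch of one possible approach in PySpark on Databricks, assuming the metadata CSV has a header row with hypothetical columns column_name and data_type, and using placeholder ADLS paths:

from pyspark.sql.types import (
    StructType, StructField, IntegerType, StringType,
    DoubleType, BooleanType, DateType, TimestampType,
)

# Hypothetical ADLS paths -- replace with the real locations.
metadata_path = "abfss://<container>@<account>.dfs.core.windows.net/metadata/metadata.csv"
data_path = "abfss://<container>@<account>.dfs.core.windows.net/incoming/*.csv"

# Map the type names used in the metadata file to Spark SQL types.
type_mapping = {
    "INT": IntegerType(),
    "STRING": StringType(),
    "DOUBLE": DoubleType(),
    "BOOLEAN": BooleanType(),
    "DATE": DateType(),
    "TIMESTAMP": TimestampType(),
}

# Read the small metadata CSV itself (assumed header: column_name,data_type).
# `spark` is the SparkSession already available in a Databricks notebook.
metadata_rows = (
    spark.read.format("csv")
    .option("header", "true")
    .load(metadata_path)
    .collect()
)

# Build an explicit StructType from the metadata rows.
my_schema = StructType([
    StructField(row["column_name"], type_mapping[row["data_type"].strip().upper()], True)
    for row in metadata_rows
])

# Apply the schema when reading the data files, so every incremental
# load comes in with identical column types.
df = (
    spark.read.format("csv")
    .option("header", "true")
    .schema(my_schema)
    .load(data_path)
)

If the JSON version of the metadata follows Spark's own schema JSON format (what StructType.json() produces), StructType.fromJson(json.loads(...)) can rebuild the schema directly; alternatively, a DDL string such as "Colm1 INT, Colm2 STRING" can be passed straight to .schema(). Either way, pinning the schema up front avoids the IntegerType/StringType merge conflict when writing the incremental loads into Delta.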

0 REPLIES
