Databricks Community

auser85 · ‎12-16-2022

I am trying to read in files via the COPY INTO command, but lately I am getting this error for a certain subset of the data:

`Error while reading file: Schema conversion error: cannot convert Parquet type INT64 to Photon type double`

What might I do to make this more reliable?

These are my options; I have tried a mixture of `mergeSchema` and `overwriteSchema`:

              FILEFORMAT = PARQUET
              FORMAT_OPTIONS ('overwriteSchema' = 'true')
              COPY_OPTIONS ('overwriteSchema' = 'true', 'overwrite' = 'true')
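One way to narrow down which files trigger an error like this might be to compare each file's Parquet column types against the types the target table expects. The sketch below is a minimal, hypothetical helper: `find_type_mismatches`, the file names, and the column names are invented for illustration, and the per-file schemas are passed in as plain dictionaries (they could be collected with, for example, `pyarrow.parquet.read_schema`, assuming pyarrow is available).

```python
from typing import Dict, List, Tuple


def find_type_mismatches(
    expected: Dict[str, str],
    file_schemas: Dict[str, Dict[str, str]],
) -> List[Tuple[str, str, str, str]]:
    """Return (file, column, expected_type, actual_type) for every mismatch."""
    mismatches = []
    for path, schema in sorted(file_schemas.items()):
        for column, expected_type in expected.items():
            actual_type = schema.get(column)
            # Columns missing from a file are skipped; only differing types are flagged.
            if actual_type is not None and actual_type != expected_type:
                mismatches.append((path, column, expected_type, actual_type))
    return mismatches


# Hypothetical data: one file stores "amount" as int64 instead of double.
expected = {"id": "int64", "amount": "double"}
file_schemas = {
    "part-0001.parquet": {"id": "int64", "amount": "double"},
    "part-0002.parquet": {"id": "int64", "amount": "int64"},
}

for row in find_type_mismatches(expected, file_schemas):
    print(row)
```

Files flagged this way could then be loaded separately or rewritten with the expected type before running COPY INTO; whether that resolves the Photon error in a given workspace is an assumption to verify.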

Aviral-Bhardwaj · ‎12-16-2022

Hey @Andrew Fogarty,

I faced the same issue when I moved from the 7.3 LTS runtime to a higher runtime version. To mitigate it, you can use the cluster configuration below:

spark.sql.storeAssignmentPolicy LEGACY

spark.sql.parquet.binaryAsString true

spark.speculation false

spark.sql.legacy.timeParserPolicy LEGACY

For a detailed explanation of the above configuration, please refer to the doc linked below. It is really helpful for debugging most errors.

Spark configuration link: https://spark.apache.org/docs/latest/configuration.html
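As a rough sketch, the SQL-level settings listed above could also be applied from a notebook session instead of the cluster configuration. The helper name `apply_session_conf` and the `SESSION_CONF` dictionary are hypothetical; the only Spark call used is `spark.conf.set`. `spark.speculation` is left out on the assumption that, as a core Spark setting, it has to be set in the cluster configuration rather than at runtime.

```python
# SQL-level settings from the configuration above; these can usually be set per session.
SESSION_CONF = {
    "spark.sql.storeAssignmentPolicy": "LEGACY",
    "spark.sql.parquet.binaryAsString": "true",
    "spark.sql.legacy.timeParserPolicy": "LEGACY",
}


def apply_session_conf(spark, conf=SESSION_CONF):
    """Apply each key/value pair via spark.conf.set and return the keys applied."""
    applied = []
    for key, value in conf.items():
        spark.conf.set(key, value)
        applied.append(key)
    return applied
```

In a Databricks notebook, where a `spark` session object already exists, this would be called as `apply_session_conf(spark)` before running COPY INTO.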

If you like my answer, please upvote it.

Thanks

Aviral Bhardwaj
