Delta Live Table with Autoloader issue

Jfoxyyc — Fri, 03 Feb 2023 04:43:44 GMT

Using Auto Loader, I'm reading daily data partitioned by well. The data has a specific schema, but if there's no value for a column, that column isn't present in the JSON. For a specific column on a specific table I'm getting an error like:

Cannot convert long type to double type on merge.

If I've specified the schema on load in the DLT function, why would it be throwing this? If I read the entire partition using spark.read.json(path), it works fine; if I read it using spark.readStream.format("cloudFiles").load(path), it fails due to the merge issue.

The column has some whole integers like 0 and 1 and decimals like 1.23456. I think what's happening is that some wells return a file for a partition containing only integer values. I'm still stumped on why it might be inferring the schema instead of using the specified one. Even if it were inferring the schema, it's supposed to sample the first 1000 files or 50 GB of data, and there would never be that many files with only long values.
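To make the hypothesis above concrete, here is a minimal pure-Python sketch of how a file holding only whole numbers could be typed as long while a file with a decimal is typed as double. It is a simplified illustration, not Auto Loader's actual inference logic; the column name `rate`, the helper `infer_numeric_type`, and the sample records are all made up for the example.

```python
import json


def infer_numeric_type(json_lines, column):
    """Guess 'long' or 'double' for one column from one file's JSON lines.

    Returns None if the column never appears (missing keys are simply
    absent from the record, as described in the post).
    """
    seen = None
    for line in json_lines:
        record = json.loads(line)
        if column not in record:
            continue
        value = record[column]
        if isinstance(value, bool):
            continue  # bool is a subclass of int in Python; ignore it here
        if isinstance(value, float):
            return "double"  # a single decimal value widens the column
        if isinstance(value, int):
            seen = "long"
    return seen


# Hypothetical per-well files for the same daily partition.
file_well_a = ['{"well": "A", "rate": 0}', '{"well": "A", "rate": 1}']
file_well_b = ['{"well": "B", "rate": 1.23456}', '{"well": "B"}']

print(infer_numeric_type(file_well_a, "rate"))
print(infer_numeric_type(file_well_b, "rate"))
```

If something along these lines is happening per file or per batch, a batch typed as long would then collide with a target column already stored as double, which would be consistent with the merge error.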

Re: Delta Live Table with Autoloader issue

Murthy1 — Tue, 07 Feb 2023 16:20:47 GMT

Hello!

You can override the inferred schema by providing schema hints.

.option("cloudFiles.schemaHints", "name string, age int")

For your situation, I guess the following should work:

.option("cloudFiles.schemaHints", "<column name> long")

Re: Delta Live Table with Autoloader issue

Jfoxyyc — Fri, 10 Feb 2023 20:04:23 GMT

The column is a double, and there are some longs in it, so I'm hoping a schemaHints value of "column_name double" works. I'll test it out on a sample dataset where I think it should fail.
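A rough sketch of what that test could look like, assuming the column is called `rate` and the files live under a hypothetical `/data/wells` path (both are placeholders, as are the helper names). The reader calls are the standard `spark.readStream.format("cloudFiles")` chain with the `cloudFiles.schemaHints` option from the reply above; whether the hint resolves the merge error in this pipeline is untested.

```python
def build_schema_hints(hints):
    """Turn {"column": "type"} into the comma-separated string schemaHints expects."""
    return ", ".join(f"{name} {dtype}" for name, dtype in hints.items())


def read_well_files(spark, path, hints):
    """Return a streaming DataFrame that reads JSON files through Auto Loader.

    `spark` is the active SparkSession. In a DLT pipeline this function's
    result would be returned from a function decorated with @dlt.table.
    """
    return (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "json")
        # Pin the column to double so integer-only files are not typed as long.
        .option("cloudFiles.schemaHints", build_schema_hints(hints))
        .load(path)
    )


# Example call (placeholder path and column name):
# df = read_well_files(spark, "/data/wells", {"rate": "double"})
```

Keeping the hints in a dict makes it easy to add more columns later if other numeric fields show the same behavior.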

Re: Delta Live Table with Autoloader issue

Anonymous — Sat, 08 Apr 2023 07:35:02 GMT

Hi @Jordan Fox

Hope everything is going great.

Just wanted to check in if you were able to resolve your issue. If yes, would you be happy to mark an answer as best so that other members can find the solution more quickly? If not, please tell us so we can help you.

Cheers!