07-31-2024 09:36 PM
Hi,
I have a table with a VARIANT column (preview) that works well on Databricks Runtime 15.3. When I try to run code that references this VARIANT type in a DLT pipeline, I get:
When I try to change the policy and set the Databricks Runtime version to 15.3, I get:
INVALID_PARAMETER_VALUE: [DLT ERROR CODE: INVALID_CLUSTER_SETTING.CLIENT_ERROR] The cluster policy specified in the pipeline settings is not compatible with Delta Live Tables. Remove 'spark_version' from your cluster policy.
Please advise!
08-01-2024 12:43 AM - edited 08-01-2024 12:44 AM
Hi @udi_azulay ,
That's because DLT currently runs on a lower runtime version. Look at the release notes:
https://docs.databricks.com/en/release-notes/delta-live-tables/2024/22/index.html
Databricks Runtime versions used by this release
Channel:
CURRENT (default): Databricks Runtime 14.1
PREVIEW: Databricks Runtime 14.3
08-13-2024 04:34 AM
The PREVIEW channel is currently at 15.2, so we should be only one minor version increment away from VARIANT being available in DLT (at least I hope so...).
01-23-2025 06:19 AM - edited 01-23-2025 06:38 AM
Time has moved on: as of December 2024, both the CURRENT and PREVIEW DLT channels are based on Databricks Runtime 15.4 (https://docs.databricks.com/en/release-notes/delta-live-tables/2024/49/index.html).
The VARIANT data type should thus be supported, as it requires at least runtime version 15.3 (https://docs.databricks.com/en/ingestion/variant.html). As such, I expect VARIANT columns to be supported when using a DLT pipeline, all the more so because the documentation makes no reference to unsupported data types in relation to DLT pipelines (https://learn.microsoft.com/en-us/azure/databricks/delta-live-tables/limitations).
However, when I define a DLT pipeline that ingests messages from Azure Event Hubs using the Kafka connector, I get an error about unsupported features being used on the table. The error message appears to refer to an internal table, as the name starts with __materialization_mat.... As a result, I cannot modify the table properties as the error message suggests. Doing a full refresh of the tables associated with the pipeline does not resolve the issue.
As such I believe that VARIANT columns are still not supported in combination with DLT pipelines even though the runtime versions and documentation suggest otherwise.
3 weeks ago
@MAJVeld We got it to work.
Let me know if this works for you. We're not using serverless right now, so I can't speak to that, but I had to set the channel to preview instead of current for it to work (and we're using photon, but that probably doesn't matter).
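For reference, the channel is a field in the pipeline settings JSON. Here is a minimal sketch of the relevant fragment, built as a Python dict so it can be printed or passed along. The pipeline name is a hypothetical placeholder, and `photon` is included only because we happen to have it switched on:

```python
import json

# Fragment of a DLT pipeline's settings. "channel" takes "CURRENT" or "PREVIEW".
# The pipeline name is a made-up placeholder, not our real one.
pipeline_settings = {
    "name": "variant_demo_pipeline",  # hypothetical
    "channel": "PREVIEW",             # switch from the default CURRENT channel
    "photon": True,                   # probably not relevant to VARIANT support
}

# Print the fragment as JSON, e.g. to paste into the pipeline's JSON settings view.
print(json.dumps(pipeline_settings, indent=2))
```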
And again, after the create statement we had to set the table properties. It sounds like the table properties are what matter in your case. You should be able to set them within the table definition. See screenshot.
3 weeks ago
I will echo this. I can run my materialized view when it's not attached to the pipeline (as long as I have this: "
2 weeks ago
I can indeed confirm that adding some additional table properties to the @dlt.table decorator in the DLT pipeline definition resolved the earlier issues. Thanks for pointing this out.
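For anyone landing here later, a rough sketch of what this can look like in Python. The thread does not spell out the exact property, so the key below is an assumption on my part: as far as I know it is the Delta table feature flag for VARIANT, but please verify it against the documentation for your runtime. The table name and the source table are hypothetical, and `spark` is the session that the DLT runtime provides.

```python
# Assumed property key: not quoted anywhere in this thread, verify before use.
VARIANT_TABLE_PROPERTIES = {"delta.feature.variantType-preview": "supported"}

try:
    import dlt  # the Delta Live Tables module, available inside a DLT pipeline
except ImportError:
    dlt = None  # allows this file to be imported outside a pipeline

# The hasattr check guards against the unrelated PyPI package also named "dlt".
if dlt is not None and hasattr(dlt, "table"):

    @dlt.table(
        name="events_variant",  # hypothetical table name
        table_properties=VARIANT_TABLE_PROPERTIES,
    )
    def events_variant():
        # "raw_events" is a hypothetical upstream table with a VARIANT column;
        # `spark` is injected by the DLT runtime.
        return spark.read.table("raw_events")
```

The properties are set on the table definition itself, so there is no need to alter the internal __materialization table that the error message points to.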