Databricks Community

NotARobot · ‎02-12-2024

Trying to do a url_decode on a column, which works great in development, but running via DLT fails when trying multiple ways.

1. pyspark.sql.functions.url_decode - This is new as of 3.5.0, but isn't supported using whatever version running a DLT pipeline provides. I haven't been able to figure out what version of PySpark this is actually running. It says 12.2, but I suspect that might actually be the version of something else:
dlt:12.2-delta-pipelines-dlt-release-2024.04-rc0-commit-24b74

2. Attempt to use a simple UDF that wraps urllib.parse.unquote_plus, however this appears to be unsupported with Unit Catalog. Given the documentation states that this should be supported in versions greater than 13.1, again guessing the version is why I get this error:
pyspark.errors.exceptions.AnalysisException: [UC_COMMAND_NOT_SUPPORTED] UDF/UDAF functions are not supported in Unity Catalog

3. Have also tried to use cluster policies to attempt to set the version, however regardless of what version this attempts to force, cluster gets the same version as above. Have tried using regex, explicit version, and auto:latest with no luck.

This leads to two questions:
1. What version of PySpark is DLT running and how can users consistently find this to know what is available for use?
2. How do users force versions if cluster policies don't work?
3. Any other recommendations for doing a URL decode via DLT, since this is where the rest of our ETL pipeline is running, would prefer to not fragment out tables into separate workflows to manage.

NotARobot · ‎02-15-2024

Thanks @Retired_mod, for reference if anybody finds this, the DLT release docs are here: https://docs.databricks.com/en/release-notes/delta-live-tables/index.html
This shows which versions are running for CURRENT and PREVIEW channels. In this case, was running on CURRENT channel (Spark 3.3.2), so PREVIEW channel (Spark 3.5.0) should work for the latest PySpark functions.

Databricks Community

Delta Live Tables UDFs and Versions

Photos

Join Us as a Local Community Builder!

Exciting Opportunity to Collaborate with Us!

Intelligent Data Warehousing: AI/BI for Self-service Analytics

Share Your Thoughts on Databricks & Get Rewarded!

Get Started With Lakehouse Architecture | Pass a quiz to earn your certificate completion.

Virtual Learning Festival: 9 April - 30 April