04-24-2023 11:56 PM
I am in a situation where I have a notebook that runs in a pipeline and creates a live streaming table, so I cannot use a language other than SQL in the pipeline. I would like to format a certain column using Scala code (the formatting is complicated and difficult to replicate in SQL).
Spark allows you to register Scala methods as UDFs and access them from SQL.
But given my current situation (a pipeline with DLT), I cannot include the Scala method, or the statement that registers it in the Spark context, in the notebook.
Is there any workaround here?
04-25-2023 07:36 AM
No, DLT does not work with Scala, unfortunately. Delta Live Tables are not vanilla Spark.
Is Python an option instead of Scala?
04-25-2023 11:57 PM
Yes, Python is an option if I can use the phonenumbers library: https://pypi.org/project/phonenumbers/
04-26-2023 01:45 AM
AFAIK you can create Python UDFs in DLT, but somehow I cannot find the docs anymore. The cookbook page used to cover this:
https://docs.databricks.com/workflows/delta-live-tables/delta-live-tables-cookbook.html
But those pages seem to have been removed. If someone knows where to find them...
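For what it's worth, here is a minimal sketch of the pattern: define the formatting function in a Python notebook attached to the same DLT pipeline, register it as a Spark UDF, and call it from the SQL notebook. The formatter below is a simplified stand-in (in practice you would call the phonenumbers library inside it), and the table and column names in the comments are hypothetical.

```python
import re

def format_phone(raw):
    """Normalize a US-style number to +1-XXX-XXX-XXXX (illustrative stand-in).

    In a real pipeline you would call the phonenumbers library here, e.g.
    phonenumbers.format_number(phonenumbers.parse(raw, "US"), ...).
    """
    if raw is None:
        return None
    digits = re.sub(r"\D", "", raw)          # keep digits only
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]                  # drop the country code
    if len(digits) != 10:
        return raw                           # leave unrecognized values as-is
    return f"+1-{digits[:3]}-{digits[3:6]}-{digits[6:]}"

# In the Python notebook of the pipeline, register the function so SQL
# datasets in the same pipeline can call it (sketch, not verified on DLT):
#
#   from pyspark.sql.functions import udf
#   from pyspark.sql.types import StringType
#   spark.udf.register("format_phone", udf(format_phone, StringType()))
#
# Then in the SQL notebook (hypothetical table/column names):
#
#   CREATE OR REFRESH STREAMING LIVE TABLE customers_clean AS
#   SELECT format_phone(phone) AS phone_formatted, *
#   FROM STREAM(LIVE.customers_raw)

print(format_phone("(415) 555-0123"))  # -> +1-415-555-0123
```

Whether a UDF registered in a Python notebook is visible to the SQL notebooks of the same pipeline depends on your DLT runtime, so treat this as a starting point rather than a confirmed recipe.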