Thank you for your response, @Retired_mod!!
While the answer makes sense, I haven't been able to figure out "how" one would do that, in particular (from your answer):
- While Databricks Runtime doesn't include every library out of the box, you can still declare and use additional libraries within your Python UDF code.
- For your XML validation use case, you can import the necessary libraries directly in your Python UDF code.
I mean, I can import a library within the UDF's body, BUT how do I make that specific library available to the compute component of the cluster if it is NOT part of a Runtime? (Also, the only XML-related libraries I can see are defusedxml in 13.3 LTS, whose deprecated .lxml module now errors out, and lxml in 14.3 LTS, which is not fit for purpose. From there onwards, e.g. the Runtime 15 series, I see NO Python XML library at all.)
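To illustrate what I mean by importing within the body, here is a minimal sketch in plain Python of the kind of function I would wrap as a UDF. The function name `is_well_formed_xml` is just a placeholder of mine, and I am assuming the standard-library `xml.etree.ElementTree` module is importable where the UDF runs. It only checks well-formedness and is not hardened against malicious input, so it does not replace a third-party library, which is exactly the part I can't figure out how to bring in.

```python
def is_well_formed_xml(payload):
    """Return True if payload parses as XML, False otherwise.

    Hypothetical helper for illustration only; not a Databricks API.
    """
    # Import inside the function body, as one would inside a Python UDF.
    # Assumption: the standard-library parser is available in the UDF environment.
    import xml.etree.ElementTree as ET

    if payload is None:
        return False
    try:
        ET.fromstring(payload)
        return True
    except ET.ParseError:
        # Raised for malformed or empty documents.
        return False
```

This works for a built-in module, but the same `import` would presumably fail for a library that is not installed on the compute running the UDF.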
Would you be so kind as to provide some examples of how to bring a specific library in?
Most examples I came across just import built-in libs... but if I want a library that is not in a Runtime, the only way I know is to declare it as part of a cluster's compute, which is not the same as using Serverless.
Please correct me if I am wrong on any of these, as I am not fully familiar with Serverless warehouses.
An example would be much appreciated!!
Thank you in advance!!