I want to build an LLM-driven chatbot using an Agentic AI framework within Databricks. The idea is for the LLM to generate a SQL text string, which is then passed to a Unity Catalog-registered Python UDF tool. Within this tool, I need the SQL to be executed (based on the SQL text string it receives from the LLM) so I can immediately run a machine learning model on the returned data. I am avoiding passing data directly to the Python UDF tool so as not to blow past token limits; that is the main reason I want to pass only a SQL text string to the Python UDF and have the SQL run inside it.
However, any attempt to call spark.sql() or instantiate a SparkSession in my SQL-defined Python UDF fails under the SafeSpark sandbox (there is no global spark available, and SparkContext creation is blocked).
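For concreteness, the failing approach looks roughly like this (catalog, schema, and function names are placeholders I made up for illustration):

```sql
-- Sketch of the pattern that fails: a SQL-defined Python UDF in UC
-- that tries to run Spark SQL inside its own body.
CREATE OR REPLACE FUNCTION main.default.run_query_and_score(sql_text STRING)
RETURNS STRING
LANGUAGE PYTHON
AS $$
# Fails in the SafeSpark sandbox: there is no global `spark`,
# and creating a SparkSession/SparkContext is blocked.
from pyspark.sql import SparkSession
spark = SparkSession.builder.getOrCreate()  # raises inside the sandbox
df = spark.sql(sql_text)
return str(df.count())
$$;
```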
Is there a supported way for a SQL-defined Python UDF to invoke Spark SQL directly inside Unity Catalog?
If not, what production-quality patterns let me register a “query-driven” Python function in UC—one that takes only a SQL string and under the hood fetches the DataFrame and applies ML logic?
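One pattern I am considering (sketched below under my own assumptions, not taken from any Databricks documentation) is to keep the tool at the agent layer on the driver, where `spark` is available, rather than inside a UC UDF. The tool takes only the SQL string; the query executor and the ML scoring step are injected, so the same function can wrap `spark.sql(...).collect()` in production or a stub in tests. All names here (`make_query_tool`, `run_sql`, `score`) are hypothetical:

```python
from typing import Any, Callable, Sequence

def make_query_tool(
    run_sql: Callable[[str], Sequence[dict]],
    score: Callable[[Sequence[dict]], Any],
) -> Callable[[str], Any]:
    """Build a 'query-driven' tool: it accepts only a SQL text string,
    fetches the rows via the injected executor, and applies ML logic.
    In Databricks, run_sql could wrap spark.sql(sql).collect() on the
    driver, or the SQL Statement Execution API from outside a cluster."""
    def tool(sql_text: str) -> Any:
        rows = run_sql(sql_text)   # fetch data; no data passes through the LLM
        return score(rows)         # e.g. model.predict(...) on the fetched rows
    return tool

# Stand-in executor and "model" purely for illustration.
fake_rows = [{"x": 1.0}, {"x": 3.0}]
tool = make_query_tool(
    run_sql=lambda sql: fake_rows,
    score=lambda rows: sum(r["x"] for r in rows) / len(rows),  # mean of x
)
print(tool("SELECT x FROM demo"))  # -> 2.0
```

The design choice here is dependency injection: the tool's signature still matches what the agent framework expects (a single SQL string in), but nothing inside it depends on a SparkSession existing at definition time, so it sidesteps the sandbox restriction entirely.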
Similar questions have been asked before, but without a satisfactory resolution.
Any pointers or examples would be greatly appreciated!