cancel
Showing results for 
Search instead for 
Did you mean: 
Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
cancel
Showing results for 
Search instead for 
Did you mean: 

Databricks Python stored procedures

RGSLCA
New Contributor

Hi,

I am using databricks runtime 17.3.x-scala2.13 , 

 

But I am unable to create python stored procedures, (functions are possible but they dont support a spark session like below) , any thoughts/help is much appreciated ?
 
[INVALID_STATEMENT_OR_CLAUSE] The statement or clause: CREATE PROCEDURE ... LANGUAGE PYTHON is not valid. SQLSTATE: 42601 == SQL (line 2, position 1) == LANGUAGE PYTHON ^^^^^^^^^^^^^^^
 
This is the code I am trying
 
CREATE OR REPLACE PROCEDURE test_cat.rg_test_landing.get_max_table_version(IN table_name STRING, OUT p_max_version BIGINT)
LANGUAGE PYTHON
SQL SECURITY INVOKER
AS $$
   
    df = spark.sql(f"DESCRIBE HISTORY {tbl_name}")
 
    target_operations = ["WRITE", "MERGE", "UPDATE", "DELETE"]
    filtered_df = df.filter(df.operation.isin(target_operations))
 
    if filtered_df.count() > 0:
        max_version = filtered_df.select("version").orderBy("version", ascending=False).first()[0]
        --print(max_version)
        p_max_version = max_version
    else:
        p_max_version = None
 
$$;
 
I also found the following Youtube video as well https://www.youtube.com/watch?v=f4TxNBfSNqM
 
Thanks!
Python stored procedures allow for the integration of Python code within Databricks SQL, combining Python's ease of use with Databricks SQL's powerful data processing capabilities. Users can now write, store, and execute Python scripts as part of their SQL workflows, making it easier to handle ...
1 ACCEPTED SOLUTION

Accepted Solutions

balajij8
Contributor III

You can use the language SQL instead of PYTHON as its the supported language for stored procedures.

SQL stored procedures are good for scripting & creating modular SQL based workflows within boundary of Unity Catalog. It's a secure governed way to execute SQL without leaving warehouse.

However, if the workflows need advanced code logic, complex loops or manipulation of the Spark session, you can use databricks notebook running Python spark code as it gives access to the full power of Spark helping you create scalable & complex data pipelines that SQL cannot support.

More details here

View solution in original post

4 REPLIES 4

balajij8
Contributor III

You can use the language SQL instead of PYTHON as its the supported language for stored procedures.

SQL stored procedures are good for scripting & creating modular SQL based workflows within boundary of Unity Catalog. It's a secure governed way to execute SQL without leaving warehouse.

However, if the workflows need advanced code logic, complex loops or manipulation of the Spark session, you can use databricks notebook running Python spark code as it gives access to the full power of Spark helping you create scalable & complex data pipelines that SQL cannot support.

More details here

szymon_dybczak
Esteemed Contributor III

Hi @RGSLCA ,

This video is a bit misleading - look at the comments section. This feature was not released and as of now you can only create stored procedure using an SQL language.

naveen0808
New Contributor

Databricks stored procedures currently support LANGUAGE SQL, not LANGUAGE PYTHON, so that error is expected. Docs show CREATE PROCEDURE supports LANGUAGE SQL only, on DBR 17.0+ with Unity Catalog.

emma_s
Databricks Employee
Databricks Employee

Hi

As others have said stored procedures don't currently support Python. You can either create the stored procedure with SQL using Windows functions and describe history or put it into a notebook and not have stored procedures. The video is about experimental features.

Current docs: 

Thanks,

Emma