
Shared Cluster running pushdown queries in Azure SQL

Rahul_Lalwani
New Contributor II

I just want to run pushdown queries, for example UPDATE statements or stored procedure calls, against Azure SQL Server using a shared cluster.

We have built modules that ran without any issues, using the Spark gateway to create a connection to Azure SQL and run UPDATE and other pushdown queries.
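
Roughly, the pattern we were using looks like the following sketch (assuming the usual Py4J DriverManager route; the server, credentials, and statement are all placeholders):

# Sketch of the gateway/DriverManager pattern described above; all
# connection details are placeholders.
jdbc_url = "jdbc:sqlserver://<your_server>.database.windows.net:1433;database=<your_database>"

driver_manager = spark.sparkContext._gateway.jvm.java.sql.DriverManager
conn = driver_manager.getConnection(jdbc_url, "<your_username>", "<your_password>")
stmt = conn.createStatement()
stmt.executeUpdate("UPDATE <your_table> SET <column> = <value> WHERE <condition>")  # statement runs inside Azure SQL
stmt.close()
conn.close()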

This functionality has been blocked on shared clusters.

RLS/CLS-enabled tables can only be accessed from a shared cluster, and the modules we have built rely on this Azure SQL pushdown functionality for some use cases.

Is there any way to run pushdown queries from a shared cluster to Azure SQL?

I know Spark read and write works, and we will migrate to it in the future, but for this use case there has to be some workaround in the meantime.

I tried using pyodbc, but I wasn't able to install the Microsoft ODBC driver for SQL Server either, even through init scripts: the install fails when not run with sudo, and with sudo it prompts for a password because it's a shared cluster.
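
For reference, the pyodbc call that the missing driver blocks would look something like this sketch (the driver name assumes Microsoft ODBC Driver 18; all connection details are placeholders):

import pyodbc

# Connecting requires the Microsoft ODBC driver to be installed on the
# cluster nodes, which is the step that fails on the shared cluster.
conn = pyodbc.connect(
    "DRIVER={ODBC Driver 18 for SQL Server};"
    "SERVER=<your_server>.database.windows.net,1433;"
    "DATABASE=<your_database>;UID=<your_username>;PWD=<your_password>"
)
cursor = conn.cursor()
cursor.execute("UPDATE <your_table> SET <column> = <value> WHERE <condition>")
conn.commit()
conn.close()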

I have been trying to find a workaround for three weeks but couldn't find one. This is becoming a bigger issue for me, as I'm stuck with the following contradictory constraints:

1. RLS/CLS tables aren't accessible from interactive or job clusters.

2. Azure SQL pushdown queries aren't supported from shared clusters.

2 REPLIES

SathyaSDE
New Contributor II

Hi,

Did you try something like the below?

# Define the JDBC connection properties
jdbc_url = "jdbc:sqlserver://<your_server>.database.windows.net:1433;database=<your_database>"
connection_properties = {
    "user": "<your_username>",
    "password": "<your_password>",
    "driver": "com.microsoft.sqlserver.jdbc.SQLServerDriver"
}

# Write the DataFrame to the Azure SQL table
dataframe.write.jdbc(url=jdbc_url, table="<your_table>", mode="overwrite", properties=connection_properties)

Rahul_Lalwani
New Contributor II

This is for overwriting the table; I want to run pushdown queries, for example UPDATE statements and stored procedures, which I was doing before by invoking the connection through the Spark DriverManager, now blocked on shared clusters. The final plan is to use the overwrite approach, but it requires a lot of changes to the current module. For the moment I want a workaround for running queries through a connection from a shared cluster, so that at least the RLS/CLS-enabled tables stay accessible.
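
For reference, the overwrite-based migration we're planning would look roughly like this sketch (jdbc_url and connection_properties as defined in the reply above; the table and column names are illustrative placeholders):

from pyspark.sql import functions as F

# Read the current table state over JDBC (reads still work on shared clusters).
df = spark.read.jdbc(url=jdbc_url, table="<your_table>", properties=connection_properties)

# Apply the equivalent of: UPDATE <your_table> SET status = 'done' WHERE id = 42
updated = df.withColumn(
    "status",
    F.when(F.col("id") == 42, F.lit("done")).otherwise(F.col("status")),
)

# Overwrite the table with the updated rows.
updated.write.jdbc(url=jdbc_url, table="<your_table>", mode="overwrite", properties=connection_properties)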
