
Shared Cluster running pushdown queries in Azure SQL

Rahul_Lalwani
New Contributor II

I just want to run pushdown queries, for example UPDATE statements or stored procedure calls, against Azure SQL Server using a shared cluster.

We have built modules that ran without any issues, using the Spark gateway to create a connection to Azure SQL and run UPDATE and other pushdown queries.
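
Roughly, the pattern we were using looks like the following sketch (assuming the usual Py4J DriverManager route; the server, credentials, and statement are all placeholders):

# Sketch of the gateway/DriverManager pattern described above; all
# connection details are placeholders.
jdbc_url = "jdbc:sqlserver://<your_server>.database.windows.net:1433;database=<your_database>"

driver_manager = spark.sparkContext._gateway.jvm.java.sql.DriverManager
conn = driver_manager.getConnection(jdbc_url, "<your_username>", "<your_password>")
stmt = conn.createStatement()
stmt.executeUpdate("UPDATE <your_table> SET <column> = <value> WHERE <condition>")  # statement runs inside Azure SQL
stmt.close()
conn.close()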

This functionality has been blocked on shared clusters.

RLS/CLS-enabled tables can only be accessed from a shared cluster, and the modules we have built rely on this Azure SQL pushdown functionality for some use cases.

Is there any way to run pushdown queries from a shared cluster to Azure SQL?

I know Spark read and write works, and we will migrate to it in the future, but for this use case there has to be some workaround in the meantime.

I tried using pyodbc, but I wasn't able to install the Microsoft ODBC driver for SQL Server either, even through init scripts: the install fails when not run with sudo, and with sudo it prompts for a password because it's a shared cluster.
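
For reference, the pyodbc call that the missing driver blocks would look something like this sketch (the driver name assumes Microsoft ODBC Driver 18; all connection details are placeholders):

import pyodbc

# Connecting requires the Microsoft ODBC driver to be installed on the
# cluster nodes, which is the step that fails on the shared cluster.
conn = pyodbc.connect(
    "DRIVER={ODBC Driver 18 for SQL Server};"
    "SERVER=<your_server>.database.windows.net,1433;"
    "DATABASE=<your_database>;UID=<your_username>;PWD=<your_password>"
)
cursor = conn.cursor()
cursor.execute("UPDATE <your_table> SET <column> = <value> WHERE <condition>")
conn.commit()
conn.close()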

I have been trying to find a workaround for three weeks but couldn't find one. This is becoming a bigger issue for me, as I'm stuck with the following contradictory constraints:

1. RLS/CLS tables aren't accessible from interactive or job clusters.

2. Azure SQL pushdown queries aren't supported from shared clusters.

2 REPLIES

SathyaSDE
New Contributor II

Hi,

Did you try something like the below?

# Define the JDBC connection properties
jdbc_url = "jdbc:sqlserver://<your_server>.database.windows.net:1433;database=<your_database>"
connection_properties = {
    "user": "<your_username>",
    "password": "<your_password>",
    "driver": "com.microsoft.sqlserver.jdbc.SQLServerDriver"
}

# Write the DataFrame to the Azure SQL table
dataframe.write.jdbc(url=jdbc_url, table="<your_table>", mode="overwrite", properties=connection_properties)

Rahul_Lalwani
New Contributor II

This is for overwriting the table; I want to run pushdown queries, for example UPDATE statements and stored procedures, which I was doing before by invoking the connection through the Spark DriverManager, now blocked on shared clusters. The final plan is to use the overwrite approach, but it requires a lot of changes to the current module. For the moment I want a workaround for running queries through a connection from a shared cluster, so that at least the RLS/CLS-enabled tables stay accessible.
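
For reference, the overwrite-based migration we're planning would look roughly like this sketch (jdbc_url and connection_properties as defined in the reply above; the table and column names are illustrative placeholders):

from pyspark.sql import functions as F

# Read the current table state over JDBC (reads still work on shared clusters).
df = spark.read.jdbc(url=jdbc_url, table="<your_table>", properties=connection_properties)

# Apply the equivalent of: UPDATE <your_table> SET status = 'done' WHERE id = 42
updated = df.withColumn(
    "status",
    F.when(F.col("id") == 42, F.lit("done")).otherwise(F.col("status")),
)

# Overwrite the table with the updated rows.
updated.write.jdbc(url=jdbc_url, table="<your_table>", mode="overwrite", properties=connection_properties)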
