โ05-16-2024 05:32 AM
Ive just set up our databricks environment. Hosted in AWS.
We have an on prem SQL server and would like to connect .
How can i do that?
โ05-16-2024 09:32 AM
@Silabs good day!
To connect your Databricks environment (hosted on AWS) to your on-premise SQL server, follow these steps:
1. Network Setup: Establish a connection between your SQL server and the Databricks virtual private cloud (VPC) using VPN or AWS Direct Connect. Test the network connectivity by running the command inside a Databricks notebook: %sh nc -vz <SQL Server IP or hostname> <port>
. Replace <SQL Server IP or hostname>
and <port>
with your SQL Server's details.
2. Install Databricks SQL Connector: On your development machine, install the Databricks SQL Connector for Python library using either pip install databricks-sql-connector
or python -m pip install databricks-sql-connector
.
3. Gather Connection Information: Collect the following information for the Databricks cluster or SQL warehouse: the server hostname of the cluster (available in the Advanced Options > JDBC/ODBC tab for your cluster) and the HTTP path of the cluster (also available in the Advanced Options > JDBC/ODBC tab).
4. Configure a Connection to SQL Server: Use the Databricks SQL Connector and the previously gathered connection information to configure a connection to your SQL Server. Ensure the user or service principal connecting to the Databricks cluster or SQL warehouse has the necessary permissions (CAN ATTACH TO, CAN RESTART, and CAN USE).
I hope this helps. Please let us know if you have any questions or concerns.
Kind regards,
Yesh
โ05-16-2024 06:11 AM
The connection itself can be done using ODBC/JDBC, but that is not the issue.
What you need to achieve is allowing AWS to connect to the on-prem server.
On AWS you can do that using PrivateLink and a VPC Endpoint, or use a VPN.
There might be more though (haven't worked in AWS for a while).
โ05-16-2024 06:44 AM
Two other things we would generally recommend are 1.) Direct Connect and 2.) Using AWS DMS, or any other CDC tool set. There are lots. We have Query Federation now in the Unity Catalog, so you could set up the connection that way, but JDBC/ODBC connections are prone to errors / dropped connections, so for production data engineering tasks its better to load into S3 using a CDC tool first and use Databricks to load the data after. If that's too heavy of a lift, then its pretty easy to use pyodbc and the sql server driver on a cluster once you setup the network access.
โ05-16-2024 09:32 AM
@Silabs good day!
To connect your Databricks environment (hosted on AWS) to your on-premise SQL server, follow these steps:
1. Network Setup: Establish a connection between your SQL server and the Databricks virtual private cloud (VPC) using VPN or AWS Direct Connect. Test the network connectivity by running the command inside a Databricks notebook: %sh nc -vz <SQL Server IP or hostname> <port>
. Replace <SQL Server IP or hostname>
and <port>
with your SQL Server's details.
2. Install Databricks SQL Connector: On your development machine, install the Databricks SQL Connector for Python library using either pip install databricks-sql-connector
or python -m pip install databricks-sql-connector
.
3. Gather Connection Information: Collect the following information for the Databricks cluster or SQL warehouse: the server hostname of the cluster (available in the Advanced Options > JDBC/ODBC tab for your cluster) and the HTTP path of the cluster (also available in the Advanced Options > JDBC/ODBC tab).
4. Configure a Connection to SQL Server: Use the Databricks SQL Connector and the previously gathered connection information to configure a connection to your SQL Server. Ensure the user or service principal connecting to the Databricks cluster or SQL warehouse has the necessary permissions (CAN ATTACH TO, CAN RESTART, and CAN USE).
I hope this helps. Please let us know if you have any questions or concerns.
Kind regards,
Yesh
Join a Regional User Group to connect with local Databricks users. Events will be happening in your city, and you wonโt want to miss the chance to attend and share knowledge.
If there isnโt a group near you, start one and help create a community that brings people together.
Request a New Group