Community Discussions

How to connect Databricks to an on-premises Oracle database to extract data

Kaviana
New Contributor III

Hello, 

How do I connect Databricks Enterprise to an on-premises Oracle database, and what permissions are necessary?

Thank you

 

 

2 REPLIES

Kaniz
Community Manager

Hi @Kaviana , you can connect Databricks Enterprise to an on-premises Oracle database using the cx_Oracle Python module.


- To install the Oracle Client libraries, follow these steps:


 - Download the Oracle Instant Client Basic Light Package.
 - Unzip the contents to a folder.
 - Upload the Instant Client folder to a cluster.
 - Copy the Instant Client folder to a system directory.
 - Set the environment variables LD_LIBRARY_PATH and ORACLE_HOME.
 - Install cx_Oracle from PyPI.
 - Restart the cluster.
- Automate the steps using an init script.

Here is a template:

%python
dbutils.fs.put("dbfs:/databricks/<init-script-folder>/oracle_ctl.sh","""
#!/bin/bash
# Download and unpack the Oracle Instant Client Basic Light package
wget --quiet -O /tmp/instantclient-basiclite-linuxx64.zip https://download.oracle.com/otn_software/linux/instantclient/instantclient-basiclite-linuxx64.zip
unzip /tmp/instantclient-basiclite-linuxx64.zip -d /databricks/driver/oracle_ctl/
# Init scripts run as root, so plain redirection is enough; note that with
# "sudo echo ... >> file" the redirection would not run under sudo anyway.
echo 'export LD_LIBRARY_PATH="/databricks/driver/oracle_ctl/"' >> /databricks/spark/conf/spark-env.sh
echo 'export ORACLE_HOME="/databricks/driver/oracle_ctl/"' >> /databricks/spark/conf/spark-env.sh
""", True)

- Configure the init script as a cluster-scoped init script.
- Install the cx_Oracle library as a cluster-installed library and restart your cluster.
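Once the Instant Client and cx_Oracle are installed, a notebook can open a connection using an Easy Connect DSN. The sketch below is a minimal illustration; the hostname, port, service name, and user are placeholders, not values from this thread:

```python
def oracle_easy_connect(host: str, port: int, service_name: str) -> str:
    # Easy Connect syntax accepted by cx_Oracle: host:port/service_name
    return f"{host}:{port}/{service_name}"

# Placeholder listener details -- substitute your on-premises database's values
dsn = oracle_easy_connect("oracle.example.com", 1521, "ORCLPDB1")

# Opening the connection requires cx_Oracle plus network access to the database:
# import cx_Oracle
# with cx_Oracle.connect(user="app_user", password="<password>", dsn=dsn) as conn:
#     cur = conn.cursor()
#     cur.execute("SELECT sysdate FROM dual")
#     print(cur.fetchone())
```

The query portion is commented out because it only runs from a cluster that can actually reach the Oracle listener over the network.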


- Permissions needed: "Can Attach To" to connect to the running cluster, and "Can Restart" to start the cluster if it is in a terminated state.

Kaviana
New Contributor III

Hi @Kaniz , how are you?

Thanks for your answer. I am trying the configuration you describe. The idea is to test an Oracle connection to an IP in a client's development environment; the VPC has already been created in AWS and the tunnel is active, but I don't know how to link them so the database can be reached from Databricks.