Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

connect databricks to teradata

Rishabh-Pandey
Esteemed Contributor

Hey, I want to know: can we connect Databricks to a Teradata database, and if so, what is the procedure? Any help would be appreciated.

Rishabh Pandey
1 ACCEPTED SOLUTION

User16255483290
Contributor

8 REPLIES

thanks

Rishabh Pandey

Harshjot
Contributor III

Hi @Rishabh Pandey, just add the Teradata JDBC jar to your Databricks cluster:

https://spark.apache.org/docs/latest/sql-data-sources-jdbc.html


jose_gonzalez
Databricks Employee


Would this connection be encrypted end to end?

BroData
New Contributor II

There are two main ways to connect to Teradata from Databricks using Python.

Way 1: Using Python Libraries (e.g., sqlalchemy, pyjdbc, pyodbc, jaydebeapi, and so on)

Pros: Provides a comprehensive solution: you can query data, trigger stored procedures, and perform other advanced database operations.

Cons: Uses only the driver node of the Databricks cluster, so it does not leverage the cluster's distributed compute, which can limit performance on large datasets.
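For Way 1, a minimal sketch using teradatasql (Teradata's Python driver; an assumption here, since any of the libraries listed above would work, and the host/user/password/SQL values are placeholders, not values from this thread) might look like this. Note it runs on the driver node only:

```python
def fetch_rows(host, user, password, sql):
    """Way-1 sketch: query Teradata directly from the Databricks driver node.

    Assumes the `teradatasql` package is installed (pip install teradatasql).
    The import is deferred so the function can be defined even where the
    driver is not installed.
    """
    import teradatasql

    with teradatasql.connect(host=host, user=user, password=password) as con:
        cur = con.cursor()
        cur.execute(sql)
        return cur.fetchall()

# Example call (placeholder values):
# rows = fetch_rows("td.example.com", "<user>", "<password>", "SELECT 1")
```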

Way 2: Using PySpark and Spark JDBC API

Step 1: Install the Maven library terajdbc on the Databricks cluster.

Step 2: Read & Write

# Read from Teradata via a pushdown query
df = spark.read.format('jdbc') \
    .option('driver', 'com.teradata.jdbc.TeraDriver') \
    .option('url', 'jdbc:teradata://<host_name>/DBS_PORT=<port_number>,TMODE=ANSI,logmech=ldap') \
    .option('user', '<user>') \
    .option('password', '<password>') \
    .option('query', '<query>') \
    .load()

df.display()

# Write the DataFrame back to a Teradata table
df.write.format('jdbc') \
    .option('driver', 'com.teradata.jdbc.TeraDriver') \
    .option('url', 'jdbc:teradata://<host_name>/DBS_PORT=<port_number>,TMODE=ANSI,logmech=ldap') \
    .option('user', '<user>') \
    .option('password', '<password>') \
    .option('dbtable', '<target_db>.<target_table>') \
    .mode('<write_mode>') \
    .save()

Note:
a. Ensure the URL is correctly configured.
b. Provide valid user credentials with appropriate access.
c. Ensure <query> or <target_db>.<target_table> is accessible by <user>.
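To illustrate note (a), here is a hypothetical helper (not from this thread) that assembles the JDBC URL in the exact format used in the snippets above; the host is a placeholder, and the defaults assume Teradata's common port 1025 with the TMODE/logmech settings shown earlier:

```python
def teradata_jdbc_url(host, port=1025, tmode="ANSI", logmech="ldap"):
    """Build a Teradata JDBC URL matching the format in the snippets above.

    1025 is Teradata's commonly used default DBS port; adjust tmode and
    logmech for your site's configuration.
    """
    return f"jdbc:teradata://{host}/DBS_PORT={port},TMODE={tmode},logmech={logmech}"

url = teradata_jdbc_url("td.example.com")
# url == "jdbc:teradata://td.example.com/DBS_PORT=1025,TMODE=ANSI,logmech=ldap"
```

The returned string can be passed straight to .option('url', ...) in the read and write calls above.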

Pros: Fully utilizes the distributed computing power of the Databricks cluster, offering excellent performance for reading and writing large datasets.

Cons: The Spark JDBC API is primarily for DataFrame-based data I/O, not procedural or transactional logic, so it supports a limited set of operations (for example, you cannot execute stored procedures or other advanced database operations).

Thanks & Regards,

BroData
