Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Can I use Databricks to join data from S3 and Postgres using SQL?

venkyv
New Contributor II

Hello, I'm very new to Databricks and I'm finding it hard to tell whether it's the right solution for our needs.

Requirement:

We have multiple data sources spread across AWS S3 and Postgres. We need a common SQL endpoint that can be used to write queries to join data across these different stores.

For example:

We have a BI tool that connects to data sources over JDBC. However this BI tool cannot "join" the data across multiple data sources. Can I use Databricks to solve this problem?

In my BI tool, I should be able to connect to Databricks over JDBC and write a SQL query like

SELECT * FROM 
S3.Schema1.Table1 AS s,
Postgres.Schema2.Table2 AS p
WHERE s.x = p.y;

This new Databricks SQL endpoint should always be available, 24/7, just like a normal DB instance. Is this possible?

PS: I'm aware I can "import" Postgres data into S3 and then make joins. But we need real-time joins without importing.

Accepted Solutions

Hubert-Dudek
Esteemed Contributor III

Yes, you can. You can ETL the data into data lake storage, register your tables in the metastore, and register your SELECT with JOINs as a VIEW. Even better, you can additionally create jobs that store the joined result as a table. From your BI tool, you can connect to Databricks SQL or directly to the data lake storage.
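
As a rough sketch of what that could look like in SQL (not a tested setup): the bucket, host, database, schema, table, and credential values below are all hypothetical placeholders. The Postgres table is registered here through Spark's JDBC data source, which queries Postgres at read time; that is an assumption about your workspace, and it may or may not be available depending on how your metastore and SQL endpoint are configured.

```sql
-- All names, paths, and credentials below are hypothetical placeholders.

-- 1. Register the S3 data as an external table in the metastore.
CREATE SCHEMA IF NOT EXISTS s3_data;
CREATE TABLE IF NOT EXISTS s3_data.table1
USING PARQUET
LOCATION 's3://example-bucket/schema1/table1/';

-- 2. Register the Postgres table via the JDBC data source (assumption:
--    the endpoint can reach Postgres and this data source is enabled).
CREATE SCHEMA IF NOT EXISTS pg_data;
CREATE TABLE IF NOT EXISTS pg_data.table2
USING JDBC
OPTIONS (
  url 'jdbc:postgresql://example-host:5432/exampledb',
  dbtable 'schema2.table2',
  user 'example_user',
  password 'example_password'  -- placeholder; do not hard-code real credentials
);

-- 3. Register the SELECT with the JOIN as a view the BI tool can query.
--    Columns are listed explicitly so the view has no duplicate names.
CREATE OR REPLACE VIEW joined_view AS
SELECT s.x, p.y
FROM s3_data.table1 AS s
JOIN pg_data.table2 AS p
  ON s.x = p.y;

-- 4. Optional: a scheduled job could run this to store the joined result
--    as a table, trading freshness for faster BI queries.
CREATE OR REPLACE TABLE joined_table AS
SELECT * FROM joined_view;
```

The BI tool would then connect to the Databricks SQL endpoint over JDBC and query `joined_view` (or `joined_table`) like any other table.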


