Hello, I'm new to Databricks and I'm finding it hard to tell whether it's the right solution for our needs.
Requirement:
We have multiple data sources spread across AWS S3 and Postgres. We need a common SQL endpoint where we can write queries that join data across these different stores.
For example:
We have a BI tool that connects to data sources over JDBC. However, this BI tool cannot join data across multiple data sources. Can I use Databricks to solve this problem?
In my BI tool, I should be able to connect to Databricks over JDBC and write a SQL query like:
SELECT *
FROM S3.Schema1.Table1 AS s
JOIN Postgres.Schema2.Table2 AS p
  ON s.x = p.y;
And this new Databricks SQL endpoint should always be available 24/7, just like a normal DB instance. Is this possible?
PS: I'm aware I could "import" the Postgres data into S3 and then do the joins there, but we need real-time joins without importing.
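From skimming the Databricks docs, Lakehouse Federation (Unity Catalog foreign catalogs) looks like it might be the relevant feature for the Postgres side. Is something like the following sketch the right direction? All hosts, catalog/schema/table names, and the secret scope below are placeholders I made up, and I'm not certain I have the syntax exactly right:

```sql
-- Hypothetical setup based on my reading of the Lakehouse Federation docs;
-- every name, host, and credential here is a placeholder.
CREATE CONNECTION pg_conn TYPE postgresql
OPTIONS (
  host 'my-postgres.example.com',
  port '5432',
  user 'readonly_user',
  password secret('my_scope', 'pg_password')
);

-- Expose the live Postgres database as a catalog on the SQL endpoint,
-- so queries hit Postgres directly instead of an imported copy.
CREATE FOREIGN CATALOG postgres_cat
USING CONNECTION pg_conn
OPTIONS (database 'mydb');

-- Assuming the S3 data is already registered as tables in some catalog
-- (here called s3_cat), the cross-source join would then just be:
SELECT *
FROM s3_cat.schema1.table1 AS s
JOIN postgres_cat.schema2.table2 AS p
  ON s.x = p.y;
```

If that's roughly how it works, my remaining question is only about keeping the endpoint available 24/7.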