06-25-2025 09:08 AM
I am creating a data lakehouse solution on Azure Databricks.
Source: SAP, Salesforce, Adobe
Target: Hightouch (external application), Mad Mobile (external application)
The data lakehouse also has transactional records, which should be stored in ACID-compliant storage.
The real challenge is that there is one more Databricks instance in a separate workspace,
and it also requires data from the data lakehouse.
Could someone please help me with how the architecture should look?
Thanks a lot.
06-25-2025 10:59 AM
@Pratikmsbsvm
You can leverage the below for your architecture solution.
Your Setup at a Glance
Sources
SAP, Salesforce, Adobe (Structured & Semi-structured)
Targets
Hightouch, Mad Mobile (External downstream apps needing curated data)
Core Requirement
Data must be stored in an ACID-compliant format → use Delta Lake (managed tables are ideal; if there are company constraints, an external location will also work)
Cross-Workspace Data Sharing
Another Databricks instance (separate workspace) needs access to this lakehouse data
[ SAP / Salesforce / Adobe ]
            │
            ▼
Ingestion Layer (via ADF / Synapse / Partner Connectors / REST API)
            │
            ▼
┌───────────────────────────┐
│   Azure Data Lake Gen2    │  (Storage layer - centralized)
│   + Delta Lake for ACID   │
└───────────────────────────┘
            │
            ▼
Azure Databricks (Primary Workspace)
 ├─ Bronze: Raw Data
 ├─ Silver: Cleaned & Transformed
 └─ Gold: Aggregated / Business Logic Applied
            │
            ├──> Load to Hightouch / Mad Mobile (via REST APIs / Hightouch Sync)
            └──> Share curated Delta Tables to Other Databricks Workspace (via Delta Sharing or External Table Mount)
Use Azure Data Factory or partner connectors (like Fivetran - we use it often in our projects) to ingest data from:
SAP → via OData / RFC connectors
Salesforce → via REST/Bulk API
Adobe → via API or S3 data export
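Once ADF or Fivetran has landed the raw extracts in ADLS Gen2, Auto Loader can pick them up incrementally into the Bronze layer. A minimal PySpark sketch - the paths, file format, and table names are illustrative placeholders, not your actual project values:

```python
# Minimal Auto Loader sketch: pick up raw Salesforce extracts that ADF / Fivetran
# has already landed in ADLS Gen2 and append them to a Bronze Delta table.
# Paths, file format, and table names are illustrative placeholders.
raw_path = "abfss://landing@<storage-account>.dfs.core.windows.net/salesforce/accounts/"
checkpoint = "abfss://lake@<storage-account>.dfs.core.windows.net/_checkpoints/bronze_sf_accounts"

(spark.readStream
    .format("cloudFiles")                         # Auto Loader source
    .option("cloudFiles.format", "json")          # format of the landed files
    .option("cloudFiles.schemaLocation", checkpoint)
    .load(raw_path)
    .writeStream
    .option("checkpointLocation", checkpoint)
    .trigger(availableNow=True)                   # run as an incremental batch job
    .toTable("bronze.sf_accounts"))               # managed Delta (ACID) table
```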
Store all raw and processed data in ADLS Gen2 in Delta Lake format
Organize Lakehouse zones:
Bronze: Raw ingested files
Silver: Cleaned & de-duplicated
Gold: Ready for consumption (BI / API sync)
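Here is a minimal PySpark sketch of the Silver and Gold hops on top of that Bronze table - the column and table names are placeholders just to show the pattern; every write is a Delta (ACID) table:

```python
from pyspark.sql import functions as F

# Silver: clean and de-duplicate the Bronze data (table and column names are
# placeholders for illustration).
silver = (spark.table("bronze.sf_accounts")
    .filter(F.col("account_id").isNotNull())
    .dropDuplicates(["account_id"])
    .withColumn("ingested_at", F.current_timestamp()))
silver.write.format("delta").mode("overwrite").saveAsTable("silver.sf_accounts")

# Gold: aggregate into a consumption-ready table for Hightouch / Mad Mobile / BI.
gold = (spark.table("silver.sf_accounts")
    .groupBy("region")
    .agg(F.countDistinct("account_id").alias("active_accounts")))
gold.write.format("delta").mode("overwrite").saveAsTable("gold.accounts_by_region")
```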
Securely share Delta tables from one workspace to another without copying data
Works across different cloud accounts
Mount or access the same ADLS Gen2 storage in both Databricks workspaces using a service principal
The other workspace can directly access the tables if permissions are aligned via groups (managed in the Databricks account console)
Periodically replicate key Delta tables to the secondary Databricks instance using jobs or Auto Loader
Use Unity Catalog (if available) for fine-grained access control
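If Unity Catalog is enabled, a minimal provider-side Delta Sharing sketch looks like the following - the share, recipient, catalog, and table names are placeholders, and the sharing identifier comes from the other workspace's account console:

```python
# Minimal provider-side Delta Sharing sketch (requires Unity Catalog).
# Share, recipient, catalog, and table names are placeholders; the sharing
# identifier comes from the recipient workspace's account console.
spark.sql("CREATE SHARE IF NOT EXISTS lakehouse_gold_share")
spark.sql("ALTER SHARE lakehouse_gold_share ADD TABLE main.gold.accounts_by_region")
spark.sql("""
    CREATE RECIPIENT IF NOT EXISTS secondary_workspace
    USING ID 'azure:westeurope:<metastore-uuid>'
""")
spark.sql("GRANT SELECT ON SHARE lakehouse_gold_share TO RECIPIENT secondary_workspace")
```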
Encrypt data at rest (ADLS) and in transit
Use service principals or managed identities for secure access between services
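For example, a minimal sketch of pointing both workspaces at the same ADLS Gen2 account through one service principal over OAuth - the storage account, secret scope, and key names are placeholders:

```python
# Minimal sketch: both workspaces authenticate to the same ADLS Gen2 account
# with one service principal via OAuth. Storage account, secret scope, and key
# names are placeholders; keep the secret in a secret scope, never in code.
storage_account = "<storage-account>"
client_id     = dbutils.secrets.get("lake-scope", "sp-client-id")
client_secret = dbutils.secrets.get("lake-scope", "sp-client-secret")
tenant_id     = dbutils.secrets.get("lake-scope", "sp-tenant-id")

spark.conf.set(f"fs.azure.account.auth.type.{storage_account}.dfs.core.windows.net", "OAuth")
spark.conf.set(f"fs.azure.account.oauth.provider.type.{storage_account}.dfs.core.windows.net",
               "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider")
spark.conf.set(f"fs.azure.account.oauth2.client.id.{storage_account}.dfs.core.windows.net", client_id)
spark.conf.set(f"fs.azure.account.oauth2.client.secret.{storage_account}.dfs.core.windows.net", client_secret)
spark.conf.set(f"fs.azure.account.oauth2.client.endpoint.{storage_account}.dfs.core.windows.net",
               f"https://login.microsoftonline.com/{tenant_id}/oauth2/token")
```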
Sources              →   Ingestion    →   Delta Lakehouse          →   Destinations
[SAP, SFDC, Adobe]       [ADF, APIs]      [Bronze, Silver, Gold]       [Hightouch, Mad Mobile, Other DBX]
                                                    ▲
                                                    │
                              Cross-Workspace Access (Delta Sharing / Mounting / Jobs)
Let me know if this helps!
06-25-2025 11:44 AM
Hi @Pratikmsbsvm, from what I understand, you have a lakehouse on Azure Databricks and would like to share this data with another Databricks account or workspace. If Unity Catalog is enabled on your Azure Databricks account, you can leverage Delta Sharing to securely share the data with other Databricks accounts.
https://docs.databricks.com/aws/en/delta-sharing/
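On the consuming (second) workspace side, a Databricks-to-Databricks share surfaces as a regular catalog. A minimal sketch, assuming placeholder provider, share, schema, and table names:

```python
# Minimal consumer-side sketch in the second Databricks workspace
# (Databricks-to-Databricks sharing, Unity Catalog on both sides).
# Provider, share, catalog, and table names are placeholders.
spark.sql("SHOW PROVIDERS").show()          # confirm the provider is visible here
spark.sql("""
    CREATE CATALOG IF NOT EXISTS shared_lakehouse
    USING SHARE primary_provider.lakehouse_gold_share
""")
spark.table("shared_lakehouse.gold.accounts_by_region").show()
```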
Feel free to post if this does not answer your question or if you need any specific details regarding this solution.