Hey @ceceliac, thanks for raising this. Here's the current picture and the practical paths you can use today.
What Databricks supports today
- The Lakehouse Federation connector for Salesforce Data Cloud is available and lets you query Data Cloud tables in place, zero-copy, under Unity Catalog governance.
- The Lakeflow Connect ingestion connector for Salesforce Platform (Sales/Service Cloud) is GA and designed to copy CRM objects into Delta tables with incremental ingestion and UC governance.
- The Salesforce ingestion connector does not currently support Marketing Cloud; the recommended alternative is to route MC data into Data Cloud, then use the Data Cloud connectors from Databricks.
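As a rough illustration of what "query in place" looks like once a foreign catalog exists, here is a minimal sketch that builds a query using Unity Catalog's three-level `catalog.schema.table` naming. The catalog, schema, and table names are hypothetical placeholders, not names the connector creates for you; substitute whatever your admin set up.

```python
def quote_ident(name: str) -> str:
    """Backtick-quote one identifier, escaping any embedded backticks."""
    return "`" + name.replace("`", "``") + "`"


def federated_select(catalog: str, schema: str, table: str, limit: int = 10) -> str:
    """Build a SELECT against a foreign catalog via catalog.schema.table."""
    if limit < 1:
        raise ValueError("limit must be a positive integer")
    full_name = ".".join(quote_ident(part) for part in (catalog, schema, table))
    return f"SELECT * FROM {full_name} LIMIT {limit}"


# Hypothetical names: replace with your own foreign catalog, schema, and table.
query = federated_select("sfdc_data_cloud", "default", "email_engagement")
print(query)
```

In a Databricks notebook you would pass the resulting string to `spark.sql(query)`; the data stays in Data Cloud and is read at query time.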
Options to get Marketing Cloud data into Databricks now
- Route SFMC to Data Cloud, then federate in Databricks
If your org enables Salesforce Data Cloud and connects Marketing Cloud to it (Salesforce provides MC→Data Cloud integrations), you can use the Salesforce Data Cloud connector in Databricks Lakehouse Federation to query the unified dataset without ingesting it.
This is the most "native" zero-copy path, with consistent governance and immediate access for analytics and ML in Databricks.
- Use a partner ELT tool to ingest SFMC directly (e.g., Fivetran)
Fivetran provides a managed Salesforce Marketing Cloud connector that can land data in Databricks. It supports core entities (EMAIL, SEND, EVENT, LIST, SUBSCRIBER, JOURNEY, etc.) and data extensions; note that data extensions are typically fully re-imported on each sync due to API limitations.
- Build a custom pipeline on SFMC APIs (Python)
Salesforce's Marketing Cloud Data Streams API Python connector (the link you shared) can be used to pull SFMC data; land the outputs in cloud storage and hydrate Delta tables via Auto Loader or DLT for incremental processing and governance.
This approach gives you control over cadence, scope, and schema handling at the cost of more engineering ownership.
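For the custom path, the "land in cloud storage, then hydrate Delta" step could look roughly like the sketch below. It assumes your extractor writes JSON files to a landing folder; the function names, paths, and table name are hypothetical, and `spark` is the session a Databricks notebook or job provides. Treat it as a starting point, not a reference implementation.

```python
from typing import Any, Dict


def autoloader_options(schema_location: str, file_format: str = "json") -> Dict[str, str]:
    """Auto Loader reader options for incrementally picking up landed files."""
    return {
        "cloudFiles.format": file_format,
        # Where Auto Loader tracks the inferred schema between runs.
        "cloudFiles.schemaLocation": schema_location,
        # Add new columns to the schema when SFMC payloads gain fields.
        "cloudFiles.schemaEvolutionMode": "addNewColumns",
    }


def start_ingest(spark: Any, source_path: str, target_table: str, state_path: str):
    """Start an incremental load from source_path into a Delta table."""
    reader = spark.readStream.format("cloudFiles")
    for key, value in autoloader_options(f"{state_path}/schema").items():
        reader = reader.option(key, value)
    return (
        reader.load(source_path)
        .writeStream
        .option("checkpointLocation", f"{state_path}/checkpoint")
        .trigger(availableNow=True)  # process what has landed, then stop
        .toTable(target_table)
    )


# Hypothetical usage inside a Databricks job:
# start_ingest(spark, "/Volumes/main/sfmc/landing/sends",
#              "main.sfmc.sends_bronze", "/Volumes/main/sfmc/_state/sends")
```

Using `availableNow` keeps this a scheduled batch-style job: each run processes only the files that arrived since the last checkpoint, which matches a pull cadence you control.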
Roadmap for a Databricks-managed SFMC connector
- We're actively tracking demand and have an Aha idea for "Lakeflow Connect Connector for SFDC Marketing Cloud" and an entry on the Lakeflow Connect timelines showing SFMC "in development" (timelines are highly subject to change and not a commitment).
- Internal guidance notes that Marketing Cloud is not supported in the current ingestion connector, and the recommended near-term path is MC→Data Cloud→Databricks Federation while we continue evaluating native SFMC ingestion demand and requirements.
Recommended architectures and trade-offs
- Zero-copy (MC→Data Cloud→Federation)
Best when you already have Data Cloud or plan to; fastest time-to-insight, no ETL, and UC governance. Good for analytics/ML prototyping and production querying; you can materialize when needed for performance or downstream processing.
- Managed ELT (Fivetran SFMC→Databricks)
Best when you want data resident in the lakehouse; covers a broad SFMC surface area and includes dbt models for SFMC analytics. Be aware data extensions often require daily full re-imports, which can lengthen syncs and may warrant separating those objects into a dedicated connection.
- Custom ingestion (Python/Data Streams API)
Best when you have bespoke needs, want tighter control, or need to minimize vendor dependencies. You own resilience, retries, and schema evolution; Databricks Auto Loader/DLT provide incremental processing and governance once landed.
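To make "you own resilience and retries" a bit more concrete, here is a minimal sketch of retrying an extraction call with exponential backoff and jitter. `TransientError` and `with_retries` are hypothetical names introduced for illustration; they are not part of any Salesforce or Databricks library, and you would map your client's timeout or rate-limit errors onto the retryable exception yourself.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical marker for retryable failures (timeouts, rate limits)."""


def with_retries(
    call: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run call(), retrying on TransientError with capped exponential backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except TransientError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the failure to the job
            # Backoff ceiling doubles each attempt: 1s, 2s, 4s, ... up to max_delay.
            ceiling = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Full jitter: wait a random time up to the ceiling.
            sleep(random.uniform(0, ceiling))
    raise AssertionError("unreachable")
```

Non-transient errors (bad credentials, schema mismatches) are deliberately not retried here, so they fail fast and show up in job alerts.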
Regards, Louis