Administration & Architecture
Explore discussions on Databricks administration, deployment strategies, and architectural best practices. Connect with administrators and architects to optimize your Databricks environment for performance, scalability, and security.

Salesforce Marketing Cloud integration

ceceliac
New Contributor III

What is the best way to get Salesforce Marketing Cloud data into Databricks? Lakeflow / Federation connectors are limited to Salesforce and Salesforce Data Cloud right now. Are there plans to add Salesforce Marketing Cloud? The only current options we can find are a Fivetran extension or this Python connector: Python Data Stream Retrievals

Thanks! 

1 REPLY

Louis_Frolio
Databricks Employee

Hey @ceceliac, thanks for raising this. Here's the current picture and the practical paths you can use today.

 

What Databricks supports today

  • The Lakehouse Federation connector for Salesforce Data Cloud is available and lets you query Data Cloud tables in place, zero-copy, under Unity Catalog governance (see the query sketch after this list).
  • The Lakeflow Connect ingestion connector for Salesforce Platform (Sales/Service Cloud) is GA and designed to copy CRM objects into Delta tables with incremental ingestion and UC governance.
  • The Salesforce ingestion connector does not currently support Marketing Cloud; the recommended alternative is to route MC data into Data Cloud and then use the Data Cloud connectors from Databricks.
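
As a rough illustration of the zero-copy path, here is a minimal sketch of querying a federated Data Cloud table from a Databricks notebook. The foreign catalog name (`sfdc_data_cloud`), schema, and table names are placeholders, and an admin must first create the Lakehouse Federation connection and foreign catalog:

```python
# Query a federated Salesforce Data Cloud table in place (zero-copy).
# `sfdc_data_cloud` and the table name below are placeholders.
df = spark.sql("""
    SELECT *
    FROM sfdc_data_cloud.default.unified_individual__dlm
    LIMIT 100
""")
display(df)

# Optionally materialize a snapshot into a Unity Catalog Delta table when you
# need lakehouse-resident data for performance or downstream processing.
spark.sql("""
    CREATE OR REPLACE TABLE main.marketing.unified_individual_snapshot AS
    SELECT * FROM sfdc_data_cloud.default.unified_individual__dlm
""")
```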

Options to get Marketing Cloud data into Databricks now

  • Route SFMC to Data Cloud, then federate in Databricks
    If your org enables Salesforce Data Cloud and connects Marketing Cloud to it (Salesforce provides MC→Data Cloud integrations), you can use Databricks Lakehouse Federation's Salesforce Data Cloud connector to query the unified dataset without ingesting it.
    This is the most "native" zero-copy path, with consistent governance and immediate access for analytics and ML in Databricks.
  • Use a partner ELT to ingest SFMC directly (e.g., Fivetran)
    Fivetran provides a managed Salesforce Marketing Cloud connector that can land data in Databricks. It supports core entities (EMAIL, SEND, EVENT, LIST, SUBSCRIBER, JOURNEY, etc.) and data extensions; note that data extensions typically require full re-imports due to API limitations.
  • Build a custom pipeline on SFMC APIs (Python)
    Salesforceโ€™s Marketing Cloud Data Streams API Python connector (the link you shared) can be used to pull SFMC data; land the outputs in cloud storage and hydrate Delta via Auto Loader or DLT for incremental processing and governance.
    This approach gives you control over cadence, scope, and schema handling at the cost of more engineering ownership; a minimal sketch follows this list.
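
For the custom route, here is a hedged sketch of one pull-and-land cycle against the SFMC REST API. The subdomain, credentials, data-extension key, and landing path are all placeholders (keep real credentials in a Databricks secret scope), and the rowset endpoint is just one example; consult the SFMC API reference for the objects you actually need:

```python
import datetime
import json

import requests

# Placeholders -- pull real credentials via dbutils.secrets.get, not literals.
SUBDOMAIN = "your-sfmc-subdomain"
CLIENT_ID = "<client-id>"
CLIENT_SECRET = "<client-secret>"
LANDING_PATH = "/Volumes/main/marketing/sfmc_landing"  # hypothetical UC volume

# 1) OAuth2 client-credentials flow against the SFMC v2 token endpoint.
token_resp = requests.post(
    f"https://{SUBDOMAIN}.auth.marketingcloudapis.com/v2/token",
    json={
        "grant_type": "client_credentials",
        "client_id": CLIENT_ID,
        "client_secret": CLIENT_SECRET,
    },
)
token_resp.raise_for_status()
access_token = token_resp.json()["access_token"]

# 2) Pull rows for one data extension by its external key (example endpoint).
rows_resp = requests.get(
    f"https://{SUBDOMAIN}.rest.marketingcloudapis.com"
    "/data/v1/customobjectdata/key/MY_DE_KEY/rowset",
    headers={"Authorization": f"Bearer {access_token}"},
)
rows_resp.raise_for_status()

# 3) Land the raw payload under a timestamped name so Auto Loader treats each
#    pull as a new file.
stamp = datetime.datetime.utcnow().strftime("%Y%m%dT%H%M%S")
with open(f"{LANDING_PATH}/rowset_{stamp}.json", "w") as f:
    json.dump(rows_resp.json(), f)
```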

Roadmap for a Databricks-managed SFMC connector

  • We're actively tracking demand and have an Aha idea for "Lakeflow Connect Connector for SFDC Marketing Cloud" and an entry on the Lakeflow Connect timelines showing SFMC "in development" (timelines are highly subject to change and not a commitment).
  • Internal guidance notes that Marketing Cloud is not supported in the current ingestion connector, and the recommended near-term path is MC→Data Cloud→Databricks Federation while we continue evaluating native SFMC ingestion demand and requirements.

Recommended architectures and trade-offs

  • Zero-copy (MC→Data Cloud→Federation)
    Best when you already have Data Cloud or plan to; fastest time-to-insight, no ETL, and UC governance. Good for analytics/ML prototyping and production querying; you can materialize when needed for performance or downstream processing.
  • Managed ELT (Fivetran SFMC→Databricks)
    Best when you want the data resident in the lakehouse; covers a broad SFMC surface area and includes dbt models for SFMC analytics. Be aware that data extensions often require daily full re-imports, which can lengthen syncs and may warrant separating those objects into a dedicated connection.
  • Custom ingestion (Python/Data Streams API)
    Best when you have bespoke needs, want tighter control, or need to minimize vendor dependencies. You own resilience, retries, and schema evolution; Databricks Auto Loader/DLT provide incremental processing and governance once landed (a minimal Auto Loader sketch follows).
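
To close the loop on the custom route, here is a minimal Auto Loader sketch that incrementally hydrates a bronze Delta table from the JSON files landed above; the paths and target table name are placeholders:

```python
# Incrementally load new JSON files from the landing path into a Delta table.
landing = "/Volumes/main/marketing/sfmc_landing"
checkpoint = "/Volumes/main/marketing/_checkpoints/sfmc_rowset"

(spark.readStream
    .format("cloudFiles")
    .option("cloudFiles.format", "json")
    .option("cloudFiles.schemaLocation", checkpoint)  # schema inference/evolution
    .load(landing)
    .writeStream
    .option("checkpointLocation", checkpoint)
    .trigger(availableNow=True)  # process all new files, then stop (batch-style)
    .toTable("main.marketing.sfmc_rowset_bronze"))
```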
 
Regards, Louis
