Today, consumers use technology to enrich their shopping experience through digital engagement, AI-driven interactions, and other digital channels before completing a purchase. Sellers, on the other hand, often lack comparable technological support, which creates a notable disparity between empowered consumers and under-equipped sellers. A modern seller experience platform should strive to empower sellers as much as consumers, or more. It should give sellers access to all the data points and touchpoints they need, including visual representations of their sales objectives, intelligent analytics, and sales-assist chatbots or AI-driven recommendations.
Databricks and Salesforce, leaders in the data, AI, and customer relationship management (CRM) fields, offer complementary capabilities that can revolutionize the seller experience. The Databricks Data Intelligence Platform allows your entire organization to use data and AI. Built on an open data lakehouse, it provides an open, unified foundation that processes and analyzes large amounts of data efficiently and understands the unique semantics of your data. Pairing Databricks with Salesforce, the leading CRM platform, yields a potent combination that streamlines sales processes and provides a modern seller experience. By combining Databricks' advanced analytics and machine learning models with Salesforce's CRM data, sellers can gain deeper insights into customer behaviors, predict sales trends, and personalize their engagement strategies more effectively.
In this blog, we will review how working with Databricks and Salesforce can benefit an organization's sales and marketing teams. As shown in the diagram (see above), the three key areas that Databricks enables when integrating with sales and marketing systems are:
In this example, Salesforce is the sales and marketing experience layer, while Databricks is the Data and AI processing layer. The processed data or insights can be served across multiple channels, including dashboards, reports, custom applications, and APIs. They can also be shared externally via a Databricks data-sharing mechanism.
Integrating diverse data sources, such as Salesforce Sales and Marketing Cloud, Salesforce Data Cloud, Seismic, Gong, and other sales and marketing systems, into Databricks as a unified and open platform gives organizations a holistic view of customer behavior and campaign performance. As a result, sales and marketing teams can conduct deep, real-time analysis, monitor the effectiveness of their programs, adjust strategies promptly, and optimize return on investment.
Databricks recently introduced native data connectors for seamless integration with Salesforce. These connectors enable customers to access and derive insights from their data in Salesforce CRM and Data Cloud from Databricks with LakeFlow Connect and Lakehouse Federation. Check out this blog for more information.
Databricks LakeFlow Connect offers simple and efficient data ingestion for databases, file sources, and enterprise applications. The LakeFlow Connect Salesforce Connector ingests Salesforce Sales data into Databricks and joins CRM insights with data in the Databricks Data Intelligence Platform, allowing data teams to deliver additional insights and more accurate predictions. LakeFlow Connect is simple to set up and maintain, is governed by Databricks Unity Catalog, and supports incremental data processing. This relieves customers from managing data pipeline infrastructure and from writing complex logic for incremental data updates and merges.
While LakeFlow Connect ingests the data into Databricks, the Salesforce Data Cloud Connector, powered by Databricks Lakehouse Federation, allows customers to discover, query, and govern Salesforce data from Databricks without data migration. Currently, the Data Cloud Connector uses JDBC, and Databricks plans to add support for Delta Lake UniForm File Federation to enable larger-scale data sharing across platforms. With this approach, known as BYOL (Bring-Your-Own-Lake), data in Salesforce Marketing Cloud can also be made available in Databricks via the Salesforce Data Cloud Connector. Both the LakeFlow Connect and BYOL methods provide simple and flexible options for working with Salesforce data from Databricks, so customers can select the option that best fits their needs.
Here’s an illustration of how Salesforce, Gong, and Google Analytics datasets can be combined in Databricks and how Databricks enables lead attribution for sales teams. The raw source datasets from Salesforce, Gong, and Google Analytics can be replicated into Databricks with LakeFlow Connect, JDBC, or third-party ETL tools.
Leads Dataset
| Lead ID | Name | Source | Status | Score | Owner |
| --- | --- | --- | --- | --- | --- |
| L001 | Jane Doe | Website Form | Qualified | 85 | Alex Smith |
| L002 | John Smith | Trade Show | New | 40 | Lisa Brown |
| L003 | Emily White | Email Campaign | Contacted | 70 | Alex Smith |
Opportunities Dataset
| Opportunity ID | Lead ID | Stage | Amount | Close Date | Probability (%) |
| --- | --- | --- | --- | --- | --- |
| O001 | L001 | Negotiation | 50,000 | 2024-12-15 | 70 |
| O002 | L002 | Proposal | 30,000 | 2024-12-20 | 50 |
| O003 | L003 | Discovery | 10,000 | 2024-12-25 | 30 |
Call Transcripts and Insights
| Call ID | Opportunity ID | Sentiment | Keywords | Duration (mins) | Speaker Ratio (Rep/Client) |
| --- | --- | --- | --- | --- | --- |
| C001 | O001 | Positive | Pricing, ROI | 45 | 60:40 |
| C002 | O002 | Neutral | Competitors | 30 | 70:30 |
| C003 | O003 | Negative | Budget, Timeline | 20 | 50:50 |
Deal Progression
| Opportunity ID | Risk Indicator | Last Activity Date | Next Steps |
| --- | --- | --- | --- |
| O001 | None | 2024-12-05 | Follow-up meeting on 2024-12-10 |
| O002 | Stalled for 7 days | 2024-11-28 | Email to confirm proposal |
| O003 | Low engagement | 2024-11-30 | Schedule a discovery call |
Traffic Metrics
| Visitor ID | Source/Medium | Pages Viewed | Session Duration (mins) | Device Type |
| --- | --- | --- | --- | --- |
| V001 | Google/Organic | 5 | 10 | Mobile |
| V002 | LinkedIn/Paid | 3 | 7 | Desktop |
| V003 | Direct/None | 8 | 20 | Tablet |
Conversion Metrics
| Visitor ID | Conversion Type | Conversion Value | Conversion Date |
| --- | --- | --- | --- |
| V001 | Form Submission | 0 | 2024-12-01 |
| V002 | E-commerce Purchase | 2000 | 2024-12-02 |
| V003 | Whitepaper Download | 0 | 2024-12-03 |
Once the raw data from the sales and marketing systems is in Databricks as streaming tables, data engineers can declaratively configure and build Delta Live Tables (DLT) pipelines. The DLT pipelines transform the data from multiple raw sources into business aggregated views, such as materialized views, that stay fresh as the sources change. Any change to data in the source systems is processed incrementally and made available to the business aggregated views (referred to as the silver and gold layers in the medallion lakehouse architecture). Data engineers can also define data quality constraints in the DLT pipelines. Databricks manages the DLT pipelines that materialize the views, which relieves customers from managing data pipeline infrastructure. The incremental processing from source (Salesforce Sales Cloud or bronze tables) to destination (materialized views) captures sales and marketing data in near real time and turns it into business insights.
In the example below of a business aggregated view, Salesforce leads are matched with their first website visit tracked in Google Analytics and enriched with Gong conversation insights. With this derived insight, each lead's attribution score can be calculated and ranked.
| Lead ID | Name | Source/Medium | Pages Viewed | Call Sentiment | Stage | Amount | Engagement Score |
| --- | --- | --- | --- | --- | --- | --- | --- |
| L001 | Jane Doe | Google/Organic | 5 | Positive | Negotiation | 50,000 | 70 |
| L002 | John Smith | LinkedIn/Paid | 3 | Neutral | Proposal | 30,000 | 50 |
| L003 | Emily White | Direct/None | 8 | Negative | Discovery | 10,000 | 30 |
Jane Doe (L001) originated from organic search and showed high engagement (5 pages viewed). Positive Gong sentiment and an advanced deal stage suggest a high likelihood of closure.
John Smith (L002) came via LinkedIn ads, but engagement metrics and deal progression are moderate. The risk flagged in Gong (stalled for 7 days) suggests a follow-up is required.
Emily White (L003) shows high web engagement (8 pages viewed), but negative Gong sentiment and an early deal stage suggest a risk of churn. Action: Address objections during discovery.
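The aggregated view above can be approximated in a few lines of plain Python to make the join logic concrete. This is a minimal sketch rather than the DLT pipeline itself, and it rests on two assumptions that are not in the source tables: `lead_to_visitor` is a hypothetical identity-resolution mapping (the post does not show how Google Analytics visitors are matched to leads), and the Engagement Score simply reuses the opportunity's Probability (%), because the sample values happen to coincide. The actual scoring formula is not specified.

```python
# Sample rows copied from the tables in this post (only the columns the view needs).
leads = [
    {"lead_id": "L001", "name": "Jane Doe"},
    {"lead_id": "L002", "name": "John Smith"},
    {"lead_id": "L003", "name": "Emily White"},
]
opportunities = [
    {"opportunity_id": "O001", "lead_id": "L001", "stage": "Negotiation", "amount": 50_000, "probability": 70},
    {"opportunity_id": "O002", "lead_id": "L002", "stage": "Proposal", "amount": 30_000, "probability": 50},
    {"opportunity_id": "O003", "lead_id": "L003", "stage": "Discovery", "amount": 10_000, "probability": 30},
]
calls = [
    {"call_id": "C001", "opportunity_id": "O001", "sentiment": "Positive"},
    {"call_id": "C002", "opportunity_id": "O002", "sentiment": "Neutral"},
    {"call_id": "C003", "opportunity_id": "O003", "sentiment": "Negative"},
]
traffic = [
    {"visitor_id": "V001", "source_medium": "Google/Organic", "pages_viewed": 5},
    {"visitor_id": "V002", "source_medium": "LinkedIn/Paid", "pages_viewed": 3},
    {"visitor_id": "V003", "source_medium": "Direct/None", "pages_viewed": 8},
]

# ASSUMPTION: hypothetical mapping from leads to web visitors; not part of the source data.
lead_to_visitor = {"L001": "V001", "L002": "V002", "L003": "V003"}


def build_lead_attribution_view(leads, opportunities, calls, traffic, lead_to_visitor):
    """Join leads with their opportunity, call, and first web visit, then rank them."""
    opp_by_lead = {o["lead_id"]: o for o in opportunities}
    call_by_opp = {c["opportunity_id"]: c for c in calls}
    visit_by_id = {t["visitor_id"]: t for t in traffic}

    rows = []
    for lead in leads:
        opp = opp_by_lead.get(lead["lead_id"])
        visit = visit_by_id.get(lead_to_visitor.get(lead["lead_id"]))
        if opp is None or visit is None:
            continue  # inner-join semantics: skip leads without a match
        call = call_by_opp.get(opp["opportunity_id"])
        rows.append({
            "lead_id": lead["lead_id"],
            "name": lead["name"],
            "source_medium": visit["source_medium"],
            "pages_viewed": visit["pages_viewed"],
            "call_sentiment": call["sentiment"] if call else None,
            "stage": opp["stage"],
            "amount": opp["amount"],
            # ASSUMPTION: placeholder score, reusing Probability (%) from the opportunity.
            "engagement_score": opp["probability"],
        })
    # Rank leads from highest to lowest score.
    return sorted(rows, key=lambda r: r["engagement_score"], reverse=True)


view = build_lead_attribution_view(leads, opportunities, calls, traffic, lead_to_visitor)
for row in view:
    print(row["lead_id"], row["source_medium"], row["call_sentiment"], row["engagement_score"])
```

In a real pipeline the same joins would live in a materialized view so that new leads, calls, and visits flow through incrementally.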
The insights derived from sales and marketing data are often passed down to sellers as dashboards and reports. More often than not, these insights are static and cannot take the seller's intuition into account.
Databricks AI/BI Genie enables sellers to interact with their data using natural language. It also supports interactive data exploration, letting sellers drill into metrics and uncover deeper insights. AI/BI Genie capabilities are also available as APIs, so they can be integrated into LLM applications for building agents. Below are a few examples of how a seller can interact with the lead attribution dataset in natural language with Databricks Genie.
Unlike Salesforce Einstein AI, Databricks Mosaic AI offers an array of capabilities, including model development, serving, inference, evaluation, monitoring, and observability, and comprehensively analyzes diverse data sources such as web analytics, social media, and third-party data. Databricks can amalgamate these insights with Salesforce data, providing a more holistic view of customer behavior than what Salesforce Einstein can offer. Furthermore, Databricks ML/AI/GenAI models can be exposed as a service within Einstein AI (BYOM), enabling Salesforce applications to develop enriched experiences based on Databricks Mosaic AI models. Databricks is the sole AI platform that unifies governance for all machine learning assets—from data and features to models—into a single catalog. This ensures complete visibility and meticulous control throughout the AI workflow. The platform’s integration of data and AI allows for automatic lineage tracking, centralized governance, collaboration, and monitoring capabilities to identify anomalies within all data and AI workflows, reducing time to value and operational costs.
Here’s an example of leveraging Salesforce, Gong, and Google Analytics to train a machine learning model that predicts the likelihood of successfully closing a deal. At a high level:
The model incorporates various features from different sources. From Salesforce, we use the Lead Score, which indicates lead quality, the current Stage in the sales pipeline (e.g., Discovery or Proposal), the Opportunity Amount in monetary terms, and the Close Date, which is the expected timeframe for closing the deal. Gong features include Call Sentiment, a classification of sentiment as Positive, Neutral, or Negative; Call Duration, the total time spent on calls related to the deal; Keywords, the presence of specific terms encoded as binary or frequency count; and a Risk Indicator that flags stalled deals or low engagement. Google Analytics contributes additional features, such as Source/Medium, which captures the origin of traffic (e.g., Organic or Paid), Pages Viewed, the number of pages visited by the lead, Session Duration, the total time spent on the website, and Conversion Actions, indicating high-value actions taken by the lead, such as form submissions.
The model's target variable is Deal Closure, which is a binary outcome: 1 for closed deals and 0 for lost deals.
The workflow consists of several key steps. First, in the Data Preprocessing phase, datasets are joined using unique identifiers like Lead ID or Opportunity ID, missing values are handled, and categorical features are encoded as numerical values. In the Feature Engineering stage, a composite risk score is created using information from Gong and Salesforce, and website behavior metrics are aggregated for analysis.
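The preprocessing and feature engineering steps can be sketched in plain Python as follows. This is an illustrative sketch under stated assumptions, not the post's actual pipeline: median imputation, one-hot encoding, and the 0-to-2 `composite_risk` rule are all hypothetical choices, and the records are the sample rows from this post, already joined on Lead ID and Opportunity ID.

```python
from statistics import median

# Sample rows from this post, already joined on Lead ID / Opportunity ID.
joined = [
    {"lead_id": "L001", "lead_score": 85, "stage": "Negotiation", "amount": 50_000,
     "sentiment": "Positive", "risk_indicator": "None", "pages_viewed": 5, "session_minutes": 10},
    {"lead_id": "L002", "lead_score": 40, "stage": "Proposal", "amount": 30_000,
     "sentiment": "Neutral", "risk_indicator": "Stalled for 7 days", "pages_viewed": 3, "session_minutes": 7},
    {"lead_id": "L003", "lead_score": 70, "stage": "Discovery", "amount": 10_000,
     "sentiment": "Negative", "risk_indicator": "Low engagement", "pages_viewed": 8, "session_minutes": 20},
]

STAGES = ["Discovery", "Proposal", "Negotiation"]
SENTIMENTS = ["Negative", "Neutral", "Positive"]


def impute_median(values):
    """Replace None with the median of the observed values (needs at least one)."""
    fill = median(v for v in values if v is not None)
    return [fill if v is None else v for v in values]


def one_hot(value, categories):
    """Encode a categorical value as a 0/1 list, one slot per known category."""
    return [1 if value == category else 0 for category in categories]


def composite_risk(sentiment, risk_indicator):
    """ASSUMPTION: hypothetical risk score; +1 for negative sentiment, +1 for any risk flag."""
    return int(sentiment == "Negative") + int(risk_indicator != "None")


def to_feature_vectors(records):
    """Turn joined records into numeric feature vectors for a classifier."""
    scores = impute_median([r["lead_score"] for r in records])
    vectors = []
    for record, score in zip(records, scores):
        vectors.append(
            [score, record["amount"], record["pages_viewed"], record["session_minutes"]]
            + one_hot(record["stage"], STAGES)
            + one_hot(record["sentiment"], SENTIMENTS)
            + [composite_risk(record["sentiment"], record["risk_indicator"])]
        )
    return vectors


features = to_feature_vectors(joined)
for record, vector in zip(joined, features):
    print(record["lead_id"], vector)
```

Each vector holds four numeric features, three stage slots, three sentiment slots, and the risk score, in that order.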
Next, in the Model Selection step, the data is split into training and testing sets and a classification model, such as Random Forest or XGBoost, is trained on the training set. During Model Evaluation, performance metrics like accuracy and F1-score are used to assess the model's effectiveness on the testing set, and hyperparameters are tuned to improve results.
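To make the evaluation step concrete, here is a small sketch of a train/test split and of the accuracy and F1-score calculations in plain Python. The classifier itself (Random Forest or XGBoost) is deliberately left out, and the label lists are made-up values used only to illustrate the arithmetic, not results from any real model.

```python
import random


def train_test_split(rows, test_fraction=0.25, seed=42):
    """Shuffle a copy of the rows and hold out a fraction for testing."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def accuracy(y_true, y_pred):
    """Share of predictions that match the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive (deal closed) class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical Lead IDs, only to show the split sizes.
lead_ids = [f"L{i:03d}" for i in range(1, 9)]
train_ids, test_ids = train_test_split(lead_ids)

# Hypothetical labels: 1 = deal closed, 0 = deal lost.
y_true = [1, 0, 0, 1, 1, 0, 1, 0]
y_pred = [1, 0, 1, 1, 0, 0, 1, 0]

print("train/test sizes:", len(train_ids), len(test_ids))
print("accuracy:", accuracy(y_true, y_pred))
print("f1:", f1_score(y_true, y_pred))
```

F1 is worth reporting alongside accuracy because closed deals are usually the minority class, and accuracy alone can look good for a model that rarely predicts a win.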
Example Dataset for Training
| Lead ID | Lead Score | Stage | Amount | Sentiment | Pages Viewed | Session Duration (mins) | Deal Closed |
| --- | --- | --- | --- | --- | --- | --- | --- |
| L001 | 85 | Negotiation | 50,000 | Positive | 5 | 10 | 1 |
| L002 | 40 | Proposal | 30,000 | Neutral | 3 | 7 | 0 |
| L003 | 70 | Discovery | 10,000 | Negative | 8 | 20 | 0 |
This approach offers several insights and benefits. It helps identify high-potential deals, allowing resources to be focused on those with a higher likelihood of closure. Flagging at-risk deals early enables proactive risk management and targeted interventions. Additionally, predicted closure probabilities support more accurate sales forecasting, and the same predictions can be used to refine lead scoring in Salesforce.
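As a rough illustration of probability-based forecasting, the sketch below weights each opportunity amount by its closure probability and flags low-probability deals. It uses the Probability (%) column from the Opportunities Dataset as a stand-in for model output, and the 40% at-risk threshold is an arbitrary, hypothetical cutoff.

```python
# Rows from the Opportunities Dataset; Probability (%) stands in for a model's prediction.
opportunities = [
    {"opportunity_id": "O001", "amount": 50_000, "probability_pct": 70},
    {"opportunity_id": "O002", "amount": 30_000, "probability_pct": 50},
    {"opportunity_id": "O003", "amount": 10_000, "probability_pct": 30},
]


def expected_revenue(opps):
    """Probability-weighted pipeline value: sum of amount * P(close)."""
    return sum(o["amount"] * o["probability_pct"] / 100 for o in opps)


def at_risk(opps, threshold_pct=40):
    """ASSUMPTION: deals below a hypothetical probability threshold count as at risk."""
    return [o["opportunity_id"] for o in opps if o["probability_pct"] < threshold_pct]


forecast = expected_revenue(opportunities)
flagged = at_risk(opportunities)

# 50,000 * 0.70 + 30,000 * 0.50 + 10,000 * 0.30 = 35,000 + 15,000 + 3,000
print("Expected revenue:", forecast)
print("At-risk deals:", flagged)
```

For the three sample opportunities the weighted forecast is 53,000 against a raw pipeline of 90,000, and only O003 falls under the threshold.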
Overall, this use case allows sales and marketing teams to leverage data-driven insights, leading to improved win rates and operational efficiency.
Let's see how sellers and marketers can monetize their lead attribution dataset along with the proprietary ML model they created. Sales and marketing agencies can offer the lead attribution dataset to other sales platforms and consultants as a subscription with Databricks Marketplace. Seller organizations can provide anonymized and aggregated data benchmarks (e.g., lead conversion rates by source, average engagement metrics) to clients to compare their performance against industry standards. This can be monetized through one-time reports or recurring subscriptions.
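A benchmark such as lead conversion rate by source can be computed with a simple aggregation. The sketch below is illustrative only: the lead records are synthetic, and suppressing groups below `min_group_size` is one hypothetical way to keep small groups from revealing individual leads, not a description of any Databricks Marketplace feature.

```python
from collections import defaultdict

# Synthetic lead outcomes, for illustration only.
lead_outcomes = [
    {"source": "Website Form", "converted": True},
    {"source": "Website Form", "converted": False},
    {"source": "Website Form", "converted": True},
    {"source": "Website Form", "converted": False},
    {"source": "Trade Show", "converted": True},
    {"source": "Trade Show", "converted": False},
    {"source": "Trade Show", "converted": False},
    {"source": "Email Campaign", "converted": True},
]


def conversion_benchmarks(outcomes, min_group_size=2):
    """Aggregate conversion rate per source, dropping groups that are too small to share."""
    counts = defaultdict(lambda: [0, 0])  # source -> [converted, total]
    for outcome in outcomes:
        bucket = counts[outcome["source"]]
        bucket[0] += 1 if outcome["converted"] else 0
        bucket[1] += 1
    return {
        source: converted / total
        for source, (converted, total) in counts.items()
        if total >= min_group_size
    }


benchmarks = conversion_benchmarks(lead_outcomes)
for source, rate in benchmarks.items():
    print(f"{source}: {rate:.0%}")
```

Only the aggregated rates leave the function; the single Email Campaign lead is suppressed because a group of one would expose that lead's outcome.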
Sales and marketing teams can use Databricks Marketplace to access datasets, AI, and analytical assets like ML models and notebooks without being tied to specific platforms, dealing with complex ETL processes, or incurring high replication costs. This open approach enables faster data utilization across different cloud platforms using preferred tools.
Sellers rely on market research data to expand into new market segments and sales territories. A modern seller experience platform must support collaboration on external datasets securely and in compliance with regulations, without exposing any first-party data or customer information. Sometimes, a vendor may be unwilling to share market research datasets with sellers unless the environment is secure and ensures data privacy. Vendors may be on different clouds, in different regions, or on different platforms. Databricks Clean Rooms enable businesses to easily collaborate with their customers and partners in a secure environment on any cloud, ensuring privacy. Sellers can securely share and analyze data with partners or other stakeholders without exposing sensitive information, enabling them to gain insights into customer behavior and optimize sales strategies.
This blog outlines how the Databricks and Salesforce platforms can be combined to power a modern seller experience platform. While Salesforce is still the best platform to interface with the seller, the integration with Databricks can expand the value of the CRM data to help drive intelligent analytics, equip the seller with a holistic set of data points, provide meaningful insights from sales and marketing data, and ultimately empower sellers with AI tools such as sales-assist agents, bot applications, and more. Moreover, since sales and marketing data is already spread beyond Salesforce into systems like Google Analytics, Gong, and Seismic, it's critical to aggregate this data to construct a comprehensive view for sellers. Finally, the data and AI landscape is evolving faster than ever, and organizations must adopt a lakehouse architecture to unify, scale, and govern their sales and marketing datasets while avoiding vendor lock-in.
With Generative AI evolving rapidly, an open architecture like Mosaic AI allows organizations to control the cost and ownership of models. Mosaic AI is a unified platform for creating classic machine learning, AI, and Generative AI applications. The Databricks Mosaic AI platform enables teams to build and collaborate on compound AI systems from a single platform with centralized governance and a unified interface for training, tracking, evaluating, swapping, and deploying. Organizations can transition from general intelligence to data intelligence by utilizing enterprise sales and marketing data.
Seller experience platforms can accelerate innovation through strategic collaboration with suppliers, partners, and vendors. Databricks Clean Rooms facilitates this collaborative process by providing a secure environment for sharing data, models, notebooks, and dashboards. It empowers businesses to seamlessly engage in secure collaboration with their customers and partners across any cloud infrastructure, ensuring privacy and data security.