When integrating Salesforce with Databricks to push data upon record creation, using a serving endpoint is not the most common or optimal approach. Although Databricks Feature Serving endpoints can be used for model or feature APIs, they are primarily designed for real-time inference or feature retrieval, not as general REST ingestion endpoints for Salesforce-originated data.
The Feature Spec function you're missing only appears when using the Databricks Feature Engineering client (databricks-feature-engineering), where you explicitly define a FeatureSpec in Unity Catalog. If you're not performing real-time feature lookups, you don't need this setup.
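For reference, this is roughly what that setup looks like with the Feature Engineering client; it is only a sketch, and the catalog, schema, table, and key names below are hypothetical placeholders, not part of your environment:

```python
# Hedged sketch: defining a FeatureSpec with the databricks-feature-engineering client.
# Only needed if you actually want real-time feature lookups via Feature Serving.
# All names (catalog/schema/table/keys) are hypothetical placeholders.
from databricks.feature_engineering import FeatureEngineeringClient, FeatureLookup

fe = FeatureEngineeringClient()

# A FeatureSpec references Unity Catalog feature tables and the lookup keys
# that a Feature Serving endpoint will use at request time.
fe.create_feature_spec(
    name="main.ml.account_features_spec",  # hypothetical UC path for the spec
    features=[
        FeatureLookup(
            table_name="main.ml.account_features",  # hypothetical UC feature table
            lookup_key="account_id",
        )
    ],
)
```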
Recommended Integration Approaches
1. Salesforce → Databricks via API Gateway or Middleware
The most reliable approach is to create a proxy API layer between Salesforce and Databricks rather than exposing Databricks directly.
You can:
- Create a small Express.js or Flask API that Salesforce calls after a record is created.
- The middleware forwards data to Databricks via the Databricks REST API or a Delta Live Tables ingestion job (a minimal sketch follows this list).
- This approach adds resiliency, logging, and retry mechanisms.
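As a concrete illustration, here is a minimal Flask sketch of such a middleware. The route, environment variables, and job ID are assumptions you would replace with your own; the only Databricks API used is the documented Jobs run-now endpoint:

```python
# Minimal Flask middleware sketch: Salesforce posts the new record here, and the
# middleware forwards it to a pre-created Databricks job via the Jobs REST API
# (POST /api/2.1/jobs/run-now). Route, env vars, and job ID are placeholders.
import json
import os

import requests
from flask import Flask, jsonify, request

app = Flask(__name__)

DATABRICKS_HOST = os.environ["DATABRICKS_HOST"]    # e.g. https://<workspace>.cloud.databricks.com
DATABRICKS_TOKEN = os.environ["DATABRICKS_TOKEN"]  # service principal token, not a user PAT
INGEST_JOB_ID = int(os.environ["INGEST_JOB_ID"])   # job that writes the record to a Delta table

@app.post("/salesforce/record-created")
def record_created():
    record = request.get_json(force=True)
    # Pass the record as a notebook parameter; the job handles validation and the Delta write.
    resp = requests.post(
        f"{DATABRICKS_HOST}/api/2.1/jobs/run-now",
        headers={"Authorization": f"Bearer {DATABRICKS_TOKEN}"},
        json={"job_id": INGEST_JOB_ID, "notebook_params": {"record_json": json.dumps(record)}},
        timeout=30,
    )
    resp.raise_for_status()
    return jsonify({"databricks_run_id": resp.json().get("run_id")}), 202
```

This keeps Salesforce's callout simple (one POST to your API) while the middleware owns authentication, logging, and retries against Databricks.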
2. Salesforce Data Cloud → Databricks (Zero Copy Integration)
If you use Salesforce Data Cloud, use the new Zero Copy Data Sharing integration:
- Allows bi-directional data sync between Salesforce and Databricks.
- Eliminates ETL complexity and avoids REST API maintenance.
- Supports direct access via Iceberg tables without duplication.
3. Databricks Lakeflow Connect (ETL-based)
Databricks has Lakeflow Connect for ingesting data from Salesforce directly using secure connectors:
- Ideal for near-real-time or batch synchronization.
- Handles authentication, schema mapping, and incremental updates natively.
- Use this if your goal is data movement rather than live transactional events.
4. Event-driven Integration with Salesforce Platform Events
If you must trigger data transfers upon record creation:
- Use Salesforce Flow or an Apex trigger to send HTTP POST requests to your middleware API.
- The middleware calls Databricks REST API endpoints (Jobs or Model Serving); a Model Serving call is sketched below.
- Avoid calling Databricks directly from Salesforce to reduce authentication and timeout issues.
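If the middleware needs to call a Model Serving endpoint rather than a job (for example to enrich or score the record before landing it), the invocation looks roughly like this; the endpoint name and payload fields are assumptions about your model's input schema:

```python
# Hedged sketch: middleware calling a Databricks Model Serving endpoint
# (POST /serving-endpoints/{name}/invocations). Endpoint name and fields are placeholders.
import os

import requests

DATABRICKS_HOST = os.environ["DATABRICKS_HOST"]
DATABRICKS_TOKEN = os.environ["DATABRICKS_TOKEN"]
ENDPOINT_NAME = "account-scoring"  # hypothetical serving endpoint name

def score_record(record: dict) -> dict:
    """Send one Salesforce record to the serving endpoint and return the prediction."""
    resp = requests.post(
        f"{DATABRICKS_HOST}/serving-endpoints/{ENDPOINT_NAME}/invocations",
        headers={"Authorization": f"Bearer {DATABRICKS_TOKEN}"},
        json={"dataframe_records": [record]},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()
```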
Best Practices
- Avoid exposing Databricks endpoints publicly for direct Salesforce calls.
- Use service principals and Unity Catalog permissions to secure Databricks endpoints.
- For real-time model inference or enrichment, use Feature Serving; for ingesting raw transactional data, use ETL or middleware orchestration.
- Validate the integration with monitoring, retry logic, and API throttling (a retry sketch follows).
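For the retry piece, one common pattern is to give the middleware's HTTP session automatic exponential backoff so transient 429/5xx responses from Databricks don't drop the Salesforce event. A sketch using requests, assuming urllib3 >= 1.26 for the allowed_methods parameter:

```python
# Sketch of retry logic for the middleware's calls to Databricks:
# retries throttled (429) and server-error (5xx) responses with exponential backoff.
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

def make_session() -> requests.Session:
    retry = Retry(
        total=5,
        backoff_factor=1.0,                        # ~1s, 2s, 4s, ... between attempts
        status_forcelist=[429, 500, 502, 503, 504],
        allowed_methods=["POST"],                  # retry POSTs too, assuming the job run is idempotent
    )
    session = requests.Session()
    session.mount("https://", HTTPAdapter(max_retries=retry))
    return session
```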
In summary, creating a Databricks serving endpoint is not the best way for general data ingestion from Salesforce. The recommended setup is a middleware or Lakeflow Connect integration, with Feature Serving endpoints reserved for machine learning applications.