When integrating Salesforce with Databricks to push data upon record creation, a serving endpoint is not the most common or optimal approach. Databricks Feature Serving endpoints can back model or feature APIs, but they are designed for real-time inference and feature retrieval, not for acting as general-purpose REST ingestion endpoints for Salesforce-originated data.
The Feature Spec Function you’re missing only appears when you use the Databricks Feature Engineering client (databricks-feature-engineering) and explicitly define a FeatureSpec in Unity Catalog. If you aren’t performing real-time feature lookups, you don’t need that setup.
Recommended Integration Approaches
1. Salesforce → Databricks via API Gateway or Middleware
The most reliable approach is to create a proxy API layer between Salesforce and Databricks rather than exposing Databricks directly.
You can:
- Create a small Express.js or Flask API that Salesforce calls after a record is created.
- The middleware forwards the data to Databricks via the Databricks REST API or a Delta Live Tables ingestion job (see the sketch after this list).
- This approach adds resiliency, logging, and retry mechanisms.
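As a rough illustration of the middleware pattern, here is a minimal Flask sketch that accepts a POST from Salesforce and triggers a Databricks job through the Jobs run-now API. The host, token, job ID, and record payload shape are assumptions you would replace with your own values, and it assumes the job's entry task is a notebook.

```python
# Minimal Flask middleware: receives a Salesforce callout and triggers a Databricks job.
# DATABRICKS_HOST, DATABRICKS_TOKEN, and DATABRICKS_JOB_ID are placeholders for your
# own workspace; the incoming record shape is illustrative only.
import json
import os

import requests
from flask import Flask, jsonify, request

app = Flask(__name__)

DATABRICKS_HOST = os.environ["DATABRICKS_HOST"]      # e.g. https://<workspace>.cloud.databricks.com
DATABRICKS_TOKEN = os.environ["DATABRICKS_TOKEN"]    # PAT or service-principal token
DATABRICKS_JOB_ID = os.environ["DATABRICKS_JOB_ID"]  # job that ingests the record

@app.route("/salesforce/record-created", methods=["POST"])
def record_created():
    record = request.get_json(force=True)

    # Forward the record as notebook parameters via the Jobs 2.1 run-now API,
    # assuming the target job runs a notebook task.
    resp = requests.post(
        f"{DATABRICKS_HOST}/api/2.1/jobs/run-now",
        headers={"Authorization": f"Bearer {DATABRICKS_TOKEN}"},
        json={
            "job_id": int(DATABRICKS_JOB_ID),
            "notebook_params": {"record_json": json.dumps(record)},
        },
        timeout=30,
    )
    resp.raise_for_status()
    return jsonify({"run_id": resp.json().get("run_id")}), 202
```

In practice you would add authentication on the /salesforce/record-created route, plus logging and retries, which is exactly the resiliency layer Salesforce-to-Databricks direct calls lack.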
2. Salesforce Data Cloud → Databricks (Zero Copy Integration)
If you use Salesforce Data Cloud, use the new Zero Copy Data Sharing integration:
- Allows bi-directional data sync between Salesforce and Databricks.
- Eliminates ETL complexity and avoids REST API maintenance.
- Supports direct access via Iceberg tables without duplicating data (a sample query is shown below).
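Once the share is available in Unity Catalog, the Salesforce objects can be queried like any other table. The catalog, schema, table, and column names below (salesforce_share.crm.account) are hypothetical; use whatever names appear in your workspace after the share is set up.

```python
# Runs in a Databricks notebook, where `spark` is predefined.
# Query a Salesforce Data Cloud object shared into Unity Catalog (names are hypothetical).
df = spark.sql("""
    SELECT Id, Name, CreatedDate
    FROM salesforce_share.crm.account
    WHERE CreatedDate >= date_sub(current_date(), 7)
""")
df.show()
```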
3. Databricks Lakeflow Connect (ETL-based)
Databricks has Lakeflow Connect for ingesting data from Salesforce directly using secure connectors:
- Ideal for near-real-time or batch synchronization.
- Handles authentication, schema mapping, and incremental updates natively.
- Use this if your goal is data movement rather than live transactional events.
4. Event-driven Integration with Salesforce Platform Events
If you must trigger data transfers upon record creation:
- Use a Salesforce Flow or Apex trigger to send HTTP POST requests to your middleware API.
- The middleware then calls Databricks REST API endpoints (Jobs or Model Serving); a serving-endpoint call is sketched below.
- Avoid calling Databricks directly from Salesforce to reduce authentication and timeout issues.
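If the middleware also needs real-time enrichment rather than just ingestion, it can call a Model Serving endpoint through the standard invocations path. The endpoint name ("record-enricher") and input columns below are hypothetical; the dataframe_records payload is one of the input formats Model Serving accepts.

```python
# Sketch: middleware-side call to a Databricks Model Serving endpoint for enrichment.
# Endpoint name and record fields are hypothetical placeholders.
import os

import requests

DATABRICKS_HOST = os.environ["DATABRICKS_HOST"]
DATABRICKS_TOKEN = os.environ["DATABRICKS_TOKEN"]

def enrich_record(record: dict) -> dict:
    resp = requests.post(
        f"{DATABRICKS_HOST}/serving-endpoints/record-enricher/invocations",
        headers={
            "Authorization": f"Bearer {DATABRICKS_TOKEN}",
            "Content-Type": "application/json",
        },
        # Send the Salesforce record as a single-row dataframe_records payload.
        json={"dataframe_records": [record]},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()
```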
Best Practices
- Avoid exposing Databricks endpoints publicly for direct Salesforce calls.
- Use service principals and Unity Catalog permissions to secure Databricks endpoints.
- For real-time model inference or enrichment, use Feature Serving; for ingesting raw transactional data, use ETL or middleware orchestration.
- Validate the integration with monitoring, retry logic, and API throttling (a simple retry sketch follows).
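For the retry and throttling point, a small backoff wrapper in the middleware is usually enough. The status codes and attempt count below are reasonable defaults, not Databricks-mandated values.

```python
# Simple exponential-backoff retry for middleware calls to Databricks (illustrative only).
import time

import requests

def post_with_retry(url: str, payload: dict, headers: dict, attempts: int = 4) -> requests.Response:
    for attempt in range(attempts):
        try:
            resp = requests.post(url, json=payload, headers=headers, timeout=30)
            # Retry only on throttling (429) and transient server errors; return otherwise.
            if resp.status_code not in (429, 500, 502, 503, 504):
                return resp
        except requests.RequestException:
            pass  # network error: fall through to backoff and retry
        time.sleep(2 ** attempt)  # back off 1s, 2s, 4s, 8s
    raise RuntimeError(f"Request to {url} failed after {attempts} attempts")
```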
In summary, creating a Databricks serving endpoint is not the best way for general data ingestion from Salesforce. The recommended setup is a middleware or Lakeflow Connect integration, with Feature Serving endpoints reserved for machine learning applications.