06-25-2025 05:10 AM - edited 06-25-2025 05:11 AM
Hi @Pratikmsbsvm ,
The appropriate approach would be:
- Data Ingestion:
- Ingest data from SAP, SAP CAR, and Salesforce using Azure Data Factory or third-party connectors. For near real-time updates, enable CDC-based ingestion.
- Data Lakehouse Storage:
- Store all raw data in Azure Data Lake Storage (ADLS) as Delta Lake tables to ensure ACID transactions and reliable data handling.
- Analytical Data Handling:
- Use Databricks SQL to power BI dashboards, reports, and analytical workloads on top of your gold layer.
- Data Processing:
- Organize data using the Medallion architecture:
- Bronze - Raw ingested data
- Silver - Cleaned and conformed data
- Gold - Aggregated, business-ready data for reporting and consumption
- Real-Time Delivery:
- For Spryker’s 15-second real-time requirement, use Databricks Structured Streaming with Azure Event Hubs or Kafka.
- Serve data to consumers such as Salesforce, Spryker, and Mad Mobile through REST APIs, or give them direct access to the gold tables.
- Error Handling & Monitoring:
- Monitor pipelines using Azure Monitor and Databricks system tables to catch failures or delays early.
- Set up alerts and logging to track job health and ensure data quality across the pipeline.
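To make the Bronze / Silver / Gold split more concrete, here is a minimal plain-Python sketch of the three steps. It is only an illustration of the logic, not Databricks code: the field names (`order_id`, `store`, `amount`, `updated_at`) and the sample rows are made up, and on Databricks each step would typically be a transformation between Delta tables rather than a list of dicts.

```python
from collections import defaultdict

# Bronze: raw records exactly as ingested (hypothetical fields; may contain
# duplicates, bad values, and inconsistent formatting).
bronze = [
    {"order_id": "1", "store": " Berlin ", "amount": "10.50", "updated_at": 1},
    {"order_id": "1", "store": "Berlin", "amount": "12.00", "updated_at": 2},
    {"order_id": "2", "store": "Paris", "amount": "not-a-number", "updated_at": 1},
    {"order_id": "3", "store": "paris", "amount": "7.25", "updated_at": 1},
]


def to_silver(rows):
    """Clean and conform: fix types, normalize text, keep the latest version per key."""
    latest = {}
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            # Skipped here for brevity; a real pipeline would likely
            # quarantine the bad row for inspection instead.
            continue
        clean = {
            "order_id": int(row["order_id"]),
            "store": row["store"].strip().title(),
            "amount": amount,
            "updated_at": row["updated_at"],
        }
        current = latest.get(clean["order_id"])
        if current is None or clean["updated_at"] > current["updated_at"]:
            latest[clean["order_id"]] = clean
    return sorted(latest.values(), key=lambda r: r["order_id"])


def to_gold(rows):
    """Aggregate to a business-ready shape: total amount per store."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["store"]] += row["amount"]
    return dict(totals)


silver = to_silver(bronze)
gold = to_gold(silver)
print(silver)
print(gold)
```

The "keep the latest version per key" step in `to_silver` is also roughly what applying CDC changes amounts to: later updates for the same key replace earlier ones before anything is aggregated into gold.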
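For the 15-second requirement and the monitoring point, one simple check is data freshness: compare the newest event time you have processed per feed against the current time and alert when the lag exceeds the budget. The sketch below is plain Python under assumptions: the feed names (`inventory`, `prices`) and timestamps are hypothetical, and in practice the timestamps would come from wherever your pipeline records the latest processed event time.

```python
from datetime import datetime, timedelta, timezone

MAX_LAG = timedelta(seconds=15)  # the 15-second requirement mentioned for Spryker


def check_freshness(last_event_time, now, max_lag=MAX_LAG):
    """Return (lag, ok); ok is False when the data is staler than max_lag."""
    lag = now - last_event_time
    return lag, lag <= max_lag


def stale_feeds(latest_by_feed, now, max_lag=MAX_LAG):
    """Sorted names of feeds whose newest record breaches the lag budget."""
    return sorted(
        name
        for name, ts in latest_by_feed.items()
        if not check_freshness(ts, now, max_lag)[1]
    )


# Hypothetical snapshot: fixed "now" so the example is reproducible.
now = datetime(2025, 6, 25, 5, 10, 30, tzinfo=timezone.utc)
latest_by_feed = {
    "inventory": now - timedelta(seconds=4),
    "prices": now - timedelta(seconds=42),
}

alerts = stale_feeds(latest_by_feed, now)
print(alerts)
```

A check like this could run on a schedule and feed whatever alerting channel you already use; the threshold is a parameter so that feeds without a strict real-time requirement can use a looser budget.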