- 279 Views
- 1 reply
- 2 kudos
Building an Incremental Customer Data Migration Workflow in Databricks
By Naveen Ayalla. Introduction: In many enterprise environments, customer data is spread across legacy systems that were originally designed for operational processing rather than mode...
Great write-up, Naveen. Very practical and clear. I really like how you focused not just on migration, but on building a reliable incremental workflow with proper duplicate handling and governance. That's where real value comes from. Also, connecting D...
- 254 Views
- 0 replies
- 1 kudos
From Tableau to Databricks: Migrating KPI Dashboards with Metric Views
How to extract Tableau calculated fields, dimensions, and measures from a .twbx workbook and re-express them as a production-grade Databricks Metric View YAML — with the Sample Superstore dataset as a complete worked example, accelerated by AI coding...
- 146 Views
- 0 replies
- 0 kudos
Automating Databricks Lakeflow Connect Pipelines for CDC Databases
Hi all, Tired of paying the data movement tax or wrestling with complex manual pipeline configs? I just published a new Medium article and open-sourced a framework that fully automates Databricks Lakeflow Connect pipelines for CDC-enabled databases usi...
- 156 Views
- 0 replies
- 1 kudos
Operating PostgreSQL CDC on AWS RDS with Lakeflow Connect
Over the years, I have helped organizations design and deliver large-scale data platforms, and one recurring lesson has remained constant: CDC failures are rarely caused by technology alone. They are usually the result of unclear ownership, missing o...
- 150 Views
- 0 replies
- 1 kudos
DAIS Community Virtual Challenge 2026: Sysl — Scanning Japanese Receipts
The Problem: Living in Japan means getting handed receipts everywhere: convenience stores, pharmacies, restaurants. Most end up in a pocket or trash, never tracked, and the coupons go unused. The Solution: Sysl is a PWA that scans any Japanese receipt au...
- 358 Views
- 1 reply
- 2 kudos
DAIS Community Virtual Challenge 2026: LEGO Value Engine - using Data and AI to Find the Best LEGO
Hey everyone! For the DAIS 2026 Community Virtual Challenge, I built a LEGO Value Engine using Databricks Free Edition. This is a passion project that combines my interests in both LEGO and data engineering. When a new LEGO set releases, it can be hard...
GitHub Repository: Lego Value Engine. Architecture Diagram (better quality version can be found here)
- 138 Views
- 0 replies
- 1 kudos
Retail demand forecasting on Databricks Free Edition
Hi Databricks Community, I built a retail sales forecasting system on Databricks Free Edition using the Rossmann Store Sales dataset: about 1,115 stores with daily sales over two and a half years. The goal was a 48-day forecast, the same horizon as t...
- 202 Views
- 0 replies
- 2 kudos
Building a Scalable Data Pipeline with Databricks Free Edition | Spark Declarative Pipelines
I recently built an end-to-end data pipeline architecture in the transportation domain, focusing on city and trip data. The pipeline follows the Bronze–Silver–Gold layered approach, where raw data is ingested into the Bronze layer, cleaned and standa...
- 138 Views
- 0 replies
- 0 kudos
Solving the "Untitled" Lineage Mystery in Unity Catalog
Hey everyone, Have you ever opened Databricks Catalog Explorer to audit a table, only to find the downstream job listed as "Untitled"? Databricks Unity Catalog is incredibly powerful for automated lineage, but it quietly breaks the moment you orchestr...
- 219 Views
- 0 replies
- 0 kudos
Why Your Delta MERGE is 5x Slower Than an Overwrite (And How to Fix It)
Hey everyone, We've all been there: a Delta Lake MERGE job that should take 20 minutes drags on for 90 minutes, while a full overwrite of the same table finishes in under 20. When an overwrite outpaces a selective merge, it's a massive red flag that y...
- 337 Views
- 2 replies
- 4 kudos
How I Passed the Databricks GenAI Engineer Associate — A No-Fluff Study Guide
Hello everyone, As a Data & Analytics Engineer with experience spanning ETL, data engineering, solution design, and data platform engineering, I currently work in the Azure data ecosystem with Azure Databricks, Terraform, and CI/CD pipelines, building...
- 161 Views
- 0 replies
- 0 kudos
Converting stored procedures to PySpark
Hi everyone, I just published a detailed article on Medium about migrating stored procedures to PySpark: https://medium.com/p/909c5c700ffd?postPublishedType=initial
- 175 Views
- 0 replies
- 1 kudos
Why We Used Two Bronze Tables Instead of One — And Why It Mattered
Part 1 of a 5-part series on building an enterprise data platform on Databricks. When migrating a large retail conglomerate's SAP HANA platform to Databricks, we needed both historical completeness and near-real-time freshness from day one. That require...
- 299 Views
- 0 replies
- 0 kudos
Stop giving your Databricks AI Agents 50 tools to manage
If you are building enterprise Generative AI, you know the pain of the "Monolithic Agent Bottleneck": passing dozens of tools to a single AI supervisor leads to hallucinated routing, massive context bloat, and security nightmares. With the recent May ...
- 228 Views
- 0 replies
- 1 kudos
Power BI - DBX Unity Catalog Lineage Syncer
Hi folks, I have recently been working on a tool as a side project. Link to my GitHub: https://github.com/JeffDenzel/lineage-syncer. The problem: The reason for it is that Databricks allows you to put external data assets into your UC. However, ther...