Explore in-depth articles, tutorials, and insights on data analytics and machine learning in the Databricks Technical Blog. Stay updated on industry trends, best practices, and advanced techniques.
Introduction
Fraud Detection in Financial Services
Every second, thousands of payment transactions flow through financial networks — card swipes at checkout, online purchases, mobile payments. Behind ...
Why Filters Matter
Databricks AI/BI Dashboards become truly useful when users can slice data by date, segment, status, or region without editing SQL. Filters make this possible by separating what use...
Spark Declarative Pipelines “How-To” Series
Introduction
Lakeflow Spark Declarative Pipelines (SDP) is a framework designed for building scalable, maintainable ETL pipelines. Everyone should want to m...
In Part 1 of this series, we focused on parsing enterprise PDFs. In this blog, we build on that work and show how those chunks are combined with structured use-case data to produce consistent, account...
Zerobus Ingest now supports Databricks Variant type via REST API (Beta), enabling schema-free JSON ingestion. No more schema definitions, no more ETL headaches—just send your data and query it.
The Sc...
Zerobus Ingest is now generally available, enabling you to push data to managed tables in your lakehouse. Zerobus Ingest delivers near real-time data ingestion—within as little as five seconds—while s...
The Challenge: From IoT to Insights in Real-Time
Picture this: It's 3 AM. Twenty sailboats are racing through rough Caribbean seas in the middle of a four-day regatta. Each boat has sensors that tran...
In a recent engagement, I partnered with a customer who had successfully productionalized sophisticated AI use cases using Genie Spaces, Multi-Agent Supervisor systems, and managed MCP servers. Their ...
Since 2022, Databricks Engineering has been on a mission to simplify streaming workloads through Project Lightspeed. We’ve democratized stateful processing with features like transformWithState and th...
Teaser
Not sure whether this blog is for you? Ask yourself these three questions. If you can’t confidently answer yes to any of them, keep reading.
Reproducibility: Can I easily reproduce the model I ...