Dive into a collaborative space where members like YOU can exchange knowledge, tips, and best practices. Join the conversation today and unlock a wealth of collective wisdom to enhance your experience and drive success.
Databricks has a new/updated feature in Beta: Databricks Container Services for standard compute. Docs: https://docs.databricks.com/aws/en/compute/custom-containers-standard With this feature, you can specify a Docker image when creating standard compu...
With Genie Code, you can now add a Tableau or Power BI file and have it build an AI/BI dashboard that replicates your existing visualizations while connecting them to metric views that mirror the underlying business logic. Import BI files using Geni...
A delivery truck arrives at a downtown Seattle coffee shop carrying 8,000 gallons of oat milk. The regional manager is furious. The AI agent that manages supply chain decisions made the call autonomously at 08:14 AM on a rainy 45°F morning. Nobody ord...
Our data team kept hitting the same problem. We needed UIs for business users: dashboards, CRUD apps, internal tools. Streamlit was our go-to, but it reruns the entire script on every interaction, the interfaces all look identical, and it gets painfu...
Hey everyone, I recently worked on building a modern financial data lakehouse using Spark Declarative Pipeline OSS (SDP OSS), Apache Iceberg, and AWS Glue Catalog. The blog covers:
- Building declarative data pipelines with Spark
- Using Apache Iceberg a...
Why Your Delta Lake Tables Are Quietly Ballooning (And How to Fix It)
If your data pipeline only appends a few gigabytes a day, but your cloud storage footprint is skyrocketing into hundreds of gigabytes, you aren’t alone. We recently watched one of o...
Most Databricks deployment pipelines on Azure still authenticate with a service principal client secret. There is a better way, and it does not require managing a single credential.
The standard pattern has a quiet problem
If you have set up Databricks...
I have spent the past couple of months building up a small hobby YouTube project where I run through Databricks features in an easy-to-digest format. I aim for a very hands-on, demo-heavy approach so you can actually see what is going on, and how th...
Hey everyone! I've built and open-sourced ar-io-mlfow. This is a plugin that adds cryptographic provenance across the ML lifecycle (training runs, model registration, stage promotions, inference, and datasets).
What it does
Creates signed Ed25519 crypto...
In today’s data-driven world, the role of a data engineer is critical in designing and maintaining the infrastructure that allows for the efficient collection, storage, and analysis of large volumes of data. Databricks certifications hold significan...
Hi everyone, I wanted to share a three-part series I recently published on Medium that examines architectural patterns from a real Databricks-based data consolidation project. The specific case is a logistics platform unifying two legacy systems into a...
This article wraps up a technical deep dive into building large-scale Lakehouse architectures, revisiting design decisions from a 2019 platform that processed billions of payment records. In the original platform, streaming pipelines ran on Spark Stre...
For a long time, one of the hardest questions in lakehouse architecture was: How do we let external engines access governed data without bypassing governance? Databricks is making this pattern much cleaner with Unity Catalog external access. The idea is...
Overview
Prompted by a customer question, I wanted to see what was possible in terms of MCP integration into Genie Code. To try this out, I decided to look at Azure DevOps, as it's a common workflow to want to see your tickets alongside the ...
Azure DevOps now has a remote MCP server. This would be much easier to use than creating a function for individual ADO API endpoints as you described above. How can I configure a connection to this remote MCP from Databricks? I'd like to use EntraID...
I came across a blog post comparing Databricks and Google BigQuery for AI-ready data teams. The workload angle stood out. That feels like a useful way to frame the discussion here in the Databricks Community. A lot of platform questions come back to t...
Evaluating pure analytics capabilities is an outdated framework that treats the data warehouse as an isolated silo. Databricks is aggressively moving to handle the entire enterprise footprint, including the BI and agentic universe. With the maturity of Data...