For a long time, one of the hardest questions in lakehouse architecture was: How do we let external engines access governed data without bypassing governance? Databricks is making this pattern much cleaner with Unity Catalog external access. The idea is...
Overview
Prompted by a customer question, I wanted to see what was possible in terms of MCP integration into Genie Code. To try this out, I decided to look at Azure DevOps, as it's a common workflow to want to see your tickets alongside the ...
Azure DevOps now has a remote MCP server. This would be much easier to use than creating a function for individual ADO API endpoints as you described above. How can I configure a connection to this remote MCP from Databricks? I'd like to use EntraID...
I came across a blog post comparing Databricks and Google BigQuery for AI-ready data teams. The workload angle stood out. That feels like a useful way to frame the discussion here in the Databricks Community. A lot of platform questions come back to t...
Evaluating pure analytics capabilities is an outdated framework that treats the data warehouse as an isolated silo. Databricks is aggressively moving to handle the entire enterprise footprint, including the BI and agentic universe. With the maturity of Data...
Hi Databricks Community, For the DAIS 2026 Community Virtual Contest, I built a project called Grocery Data Intelligence. This is a Smart Grocery Planning solution with Data + AI, built using Databricks Free Edition. The idea came from a very simple re...
One of the more frustrating things when working with materialized views in Databricks was checking whether a view had refreshed incrementally. One way to verify it was by checking the event log, but that required running the pipeline and executing a ...
Hi. This is very helpful. Any idea whether incremental refresh also works for non-algebraic functions such as median? I was looking for a solution that will work for late-arriving data and came across this. I also could not find any docum...
If your CI/CD pipelines suddenly started failing out of nowhere with this error: "error downloading Terraform: unable to verify checksums signature: openpgp: key expired" and you’re using Databricks CLI - you’re probably hitting the same issue I did. Th...
The Real Problem: Kafka Source Parallelism in Spark
Before discussing foreachBatch, multi-table writes, or any specific use case, it helps to understand the underlying issue. This is a problem with how Spark Structured Streaming consumes from Kafka, a...
Most organizations don’t have a data problem anymore. They have a data access and usability problem. The dashboards exist. The warehouses are modernized. The lakehouse is running. Yet business teams still wait days for answers because analytics remains...
Enterprise AI becomes difficult to govern as useful projects accumulate. A machine learning team ships a forecasting model. A data engineering team automates pipeline refreshes. Another group connects a generative AI assistant to internal documentati...
I'm building a live cost estimator that doesn't have to wait for the system tables or billing data to update. It gives me immediate cost feedback every second and I'm sharing the development journey on YouTube. I already have live cost estimates for ...
Hi everyone, I recently looked into a silent cost driver in many data platforms: the default choice between managed and external tables in Unity Catalog. It is very common for teams to default to external tables, but this choice often leads to acc...
For years, there was no simple way to disable a single task in a Databricks workflow.
Let that sink in.
If you wanted to skip a task, you had to get creative:
- Add custom flags
- Wrap logic in if/else blocks
- Or build your own workaround just to not...
What Is ABAC and Why Does It Matter?
Attribute-Based Access Control (ABAC) is a data governance model now available in Databricks, designed to offer fine-grained, dynamic, and scalable access control for data, AI assets, and files managed through Data...
I wanted to check if ABAC (Attribute-Based Access Control) policies can be applied to metric views in Databricks. I have successfully applied ABAC policies on a fact table, and they are working as expected. However, when I query a metric view that use...
DataHacks 2026: University Alliance in Action at UCSD
How a single weekend of hands-on exposure creates the next generation of Databricks advocates
Workshop Lead: Anjana Sriram
Why University Alliances Matter in the Field
Early in my career, the ...