- 238 Views
- 1 replies
- 0 kudos
I built ar-io-mlflow: Open-Source MLflow Plugin for Verifiable & Tamper-Proof AI Provenance
Hey everyone! I've built and open-sourced ar-io-mlflow. This is a plugin that adds cryptographic provenance across the ML lifecycle (training runs, model registration, stage promotions, inference, and datasets). What it does: Creates signed Ed25519 crypto...
I also opened a discussion on GitHub. I'm not sure which is the right place, so sorry if this isn't it: https://github.com/mlflow/mlflow/discussions/23355
- 16379 Views
- 13 replies
- 43 kudos
From Associate to Professional: My Learning Plan to ace all Databricks Data Engineer Certifications
In today’s data-driven world, the role of a data engineer is critical in designing and maintaining the infrastructure that allows for the efficient collection, storage, and analysis of large volumes of data. Databricks certifications hold significan...
Hello, can you help me with study material for the Associate-level exam? Where do we get it? The free version, if possible? Any notes?
- 163 Views
- 0 replies
- 0 kudos
Why centralized state forces distributed reconstruction — a three-part series
Hi everyone, I wanted to share a three-part series I recently published on Medium that examines architectural patterns from a real Databricks-based data consolidation project. The specific case is a logistics platform unifying two legacy systems into a...
- 208 Views
- 0 replies
- 0 kudos
From Fragmented Schedulers to Unified Orchestration: A Lakehouse Evolution
This article wraps up a technical deep dive into building large-scale Lakehouse architectures, revisiting design decisions from a 2019 platform that processed billions of payment records. In the original platform, streaming pipelines ran on Spark Stre...
- 266 Views
- 0 replies
- 1 kudos
Access Databricks data using external systems
For a long time, one of the hardest questions in lakehouse architecture was: How do we let external engines access governed data without bypassing governance? Databricks is making this pattern much cleaner with Unity Catalog external access. The idea is...
- 1364 Views
- 1 replies
- 5 kudos
Create an MCP for Azure DevOps To Use With Genie Code
Overview: Prompted by a customer question, I wanted to see what was possible in terms of MCP integration into Genie Code. To try this out, I decided to look at Azure DevOps, as it's a common workflow to want to see your tickets alongside the ...
Azure DevOps now has a remote MCP server. This would be much easier to use than creating a function for individual ADO API endpoints as you described above. How can I configure a connection to this remote MCP from Databricks? I'd like to use EntraID...
- 576 Views
- 1 replies
- 0 kudos
Databricks vs. BigQuery Through a Workload Lens
I came across a blog post comparing Databricks and Google BigQuery for AI-ready data teams. The workload angle stood out. That feels like a useful way to frame the discussion here in the Databricks Community. A lot of platform questions come back to t...
Evaluating pure analytics capabilities is an outdated framework that treats the data warehouse as an isolated silo. Databricks is aggressively moving to handle the entire enterprise footprint, including the BI and agentic universe. With the maturity of Data...
- 414 Views
- 1 replies
- 2 kudos
Finally! A simple way to validate whether your Materialized View can actually refresh incrementally
One of the more frustrating things when working with materialized views in Databricks was checking whether a view had refreshed incrementally. One way to verify it was by checking the event log, but that required running the pipeline and executing a ...
Hi. This is very helpful. Any idea whether incremental refresh also works for non-algebraic functions such as median? I was looking for a solution that will work for late-arriving data and came across this. I also could not find any docum...
- 577 Views
- 0 replies
- 1 kudos
Mitigation for "error downloading Terraform" during bundle deployments
If your CI/CD pipelines suddenly started failing out of nowhere with this error: "error downloading Terraform: unable to verify checksums signature: openpgp: key expired" and you’re using the Databricks CLI, you’re probably hitting the same issue I did. Th...
- 1201 Views
- 1 replies
- 0 kudos
Azure Databricks Exclusive groups
A new permissions primitive to prevent data crossover between use cases hosted on the same Databricks workspace.
Medium link: https://medium.com/@kacn12872/azure-databricks-exclusive-groups-garantir-létanchéité-entre-cas-d-usage-sur-la-lakehouse-e340ce28f332
- 232 Views
- 0 replies
- 1 kudos
TruProxy - Live Cost Estimator - Clusters
Hi everyone, I'm continuing to build a live cost estimator for Databricks to get immediate cost estimates every second instead of having to wait for the system tables to update. (See Live Cost Estimator - Databricks Community - 156374.) I've finished t...
- 435 Views
- 0 replies
- 1 kudos
Databricks Kafka Multi-Stream Ingestion Architecture: Scaling Beyond Single-Stream Bottlenecks
The Real Problem: Kafka Source Parallelism in Spark. Before discussing foreachBatch, multi-table writes, or any specific use case, it helps to understand the underlying issue. This is a problem with how Spark Structured Streaming consumes from Kafka, a...
- 1832 Views
- 1 replies
- 2 kudos
How to handle MERGE with Schema Evolution in Delta Lake
Hi everyone, Schema evolution during MERGE is one of the trickiest parts of building robust Delta Lake pipelines. Databricks actually has a native SQL syntax for this, plus Python API options for...
Great post. I would also consider the following points: Guardrails: schema evolution is powerful, but it can also accidentally add garbage columns if upstream sends unexpected fields. Recommendation: validate/allowlist schema changes in higher envir...
- 529 Views
- 1 replies
- 2 kudos
How Databricks Genie Turns Collaboration Tools into AI-Powered Intelligence Platforms
Most organizations don’t have a data problem anymore. They have a data access and usability problem. The dashboards exist. The warehouses are modernized. The lakehouse is running. Yet business teams still wait days for answers because analytics remains...
The Dashboard Era Is Ending. Conversation Is Replacing It.
- 465 Views
- 0 replies
- 1 kudos
Scaling Enterprise AI with Databricks Without Losing Control
Enterprise AI becomes difficult to govern as useful projects accumulate. A machine learning team ships a forecasting model. A data engineering team automates pipeline refreshes. Another group connects a generative AI assistant to internal documentati...