- 830 Views
- 4 replies
- 4 kudos
Resolved! Designing a Cost-Efficient Databricks Lakehouse, Performance Tuning and Optimization Best Practices
The Hidden Cost of Scaling the Lakehouse: Over the past few years, many organizations have successfully migrated to Databricks to modernize their data platforms. The Lakehouse architecture has enabled them to unify data engineering, analytics, and AI o...
- 4 kudos
@Saurabh2406 this is such a rich article and has so many practical takeaways! Congrats! I faced similar challenges in one of my last projects, and I could spend some time building a nice dashboard (using the system.billing tables) that helped us trac...
- 338 Views
- 3 replies
- 6 kudos
🇪🇸 Why the DataFrame Is the Most Important Data Object in Distributed Processing
In this video, created as a reminder for my poor long-term memory, I explain in simple terms: what a DataFrame is, how it is distributed across partitions, how it runs on a cluster (driver and workers), what happens in a shuffle, the relationship between...
- 6 kudos
Recently, I have been creating some "self-reminder" videos to help my poor long-term memory, and maybe to help others. Understand the internals of DataFrames, how partitions relate to jobs, stages, shuffles, and tasks, and how transformations or a...
- 219 Views
- 0 replies
- 0 kudos
🚀 LDP Tax Pipeline — Spark Declarative Pipelines on macOS (Without Databricks)
Excited to share my latest hands-on implementation of a LakeFlow Declarative Pipeline (LDP) built locally using Apache Spark 4.1 Declarative Pipelines, running entirely on ...
- 227 Views
- 0 replies
- 1 kudos
Scaling SCD on Databricks: Then vs Now
Between 2019 and 2021, we built a large-scale lakehouse on Databricks supporting multi-market payments processing (7B+ transactions/year). If ingestion was complex (covered in Part 1), the Silver layer was even more interesting. Implementing SCD Type 1...
- 275 Views
- 0 replies
- 2 kudos
Is Zerobus the Future of Ingestion on Databricks? Lessons from a 7B+ Transaction Platform
Between 2019 and 2021, we built a multi-market payments data platform on Databricks that now processes more than 7 billion transactions per year across seven markets. Ingestion was by far the most operationally complex layer. To support MongoDB CDC str...
- 210 Views
- 0 replies
- 0 kudos
What Championship Teams Teach Us About Modern Data Architecture.
High-performing data organizations succeed when all systems, teams, and processes are aligned toward a shared strategy. Fragmentation — separate tools for storage, governance, analytics, and AI, siloed ownership, redundant pipelines, or inconsistent ...
- 330 Views
- 0 replies
- 1 kudos
Lakebridge: A Developer’s Perspective on ETL Migrations
One of the recent additions to the Databricks ecosystem that caught my attention is Lakebridge, a migration accelerator aimed at legacy ETL and data warehouse workloads. Migration projects are always interesting to discuss because, in practice, they a...
- 852 Views
- 0 replies
- 3 kudos
Level up your AI Assistant Experience with Agent Skills
Did you know the Databricks Assistant now supports Agent Skills? If your team has common, repeatable workflows, this feature is one you need to explore. Skills provide the Databricks Assistant with a specific set of instructions to handle tasks custo...
- 694 Views
- 0 replies
- 2 kudos
Databricks Metric Views - Semantic Layer is moving to where it belongs
A key challenge for organizations is ensuring that data metrics mean the same thing for all teams. If BI logic is scattered across various tools, SQL, and notebooks, a metrics tax is levied (multiple dashboards showing different revenue figures). Databricks M...
- 709 Views
- 2 replies
- 1 kudos
How I Reduced Databricks Costs in a High-Volume Financial Data Platform
In financial services, data never sleeps. Trades flow in every second. Risk calculations refresh continuously. Regulatory reports demand precision. BI dashboards serve business users who expect sub-second responses. And behind all of that? A massive ...
- 1 kudos
@Nidhi_Patni Thanks for the production-level information! @wesleyfelipe > I have one question: what was the trade-off between using traditional partitioning with Z-ordering versus liquid clustering? Traditional partitioning with Z-ordering is kind of the o...
- 416 Views
- 2 replies
- 1 kudos
Resolved! Custom asset bundles file name
Hi, is there a way to give an asset bundle file a custom name and pass that to databricks bundle deploy? Right now I must use databricks.yml, so my question is whether there is a way to pass a custom file name. Note that I don't want to embed a file...
- 1 kudos
Hi @murtadha_s, the simple answer is no, but I’d like to understand the issue you’re facing so I can see if there’s anything I can help with. For example, this is how I’m using it in my production application, and it’s working quite well. We’re handlin...
- 383 Views
- 0 replies
- 3 kudos
Understanding Delta Table Partition Size Distribution Using the Delta Log
When working with externally managed Delta tables and traditional partitioning strategies (for example by day, week, or month), one common challenge is: How large are my partitions actually? Before deciding whether to partition by day vs. week vs. mont...
- 244 Views
- 0 replies
- 1 kudos
Why Replacing Developers with AI Failed: How Databricks Can Help?
AI didn’t fail to replace developers; it exposed something deeper. In the rush to adopt AI, many organizations assumed that coding assistants and automation would reduce engineering effort and accelerate delivery. But the reality has been different. M...
- 1306 Views
- 1 replies
- 1 kudos
Terraform support for AI/BI dashboards
AI/BI dashboards can now be managed through Terraform. Dashboard using the serialized_dashboard attribute: data "databricks_sql_warehouse" "starter" { name = "Starter Warehouse" } resource "databricks_dashboard" "dashboard" { display_name = "...
- 1 kudos
If the content of `file_path` changes, Terraform detects no changes. It would be better to check the MD5 of the file so that the resource can be updated.
- 131 Views
- 0 replies
- 2 kudos
Scaling Databricks Pipelines with Templates & ADF Orchestration
In a Databricks project integrating multiple legacy systems, one recurring challenge was maintaining development consistency as pipelines and team size grew. Pipeline divergence tends to emerge quickly: • Different ingestion approaches • Inconsistent tr...