Please check https://github.com/rsleedbx/crdb_to_dbx which has the steps and a working notebook. This guide shows how to stream CockroachDB data to Databricks using changefeeds, Azure Blob Storage, Unity Catalog, and Delta Lake. You get one platform...
Introduction. “AI First” - But Data Always Comes First. I have been working in the data space for close to two decades. My journey started as an ETL developer and gradually evolved into roles spanning data engineering, platform design, and solution archi...
@Saurabh2406 , I really appreciate how you grounded the “AI-first” conversation in the reality that data governance, security, and quality are what actually determine whether AI can scale beyond pilots. The tie-in to Gartner’s AI maturity model, and ...
Wait, Did Databricks Just Put Git Inside My Database?
If you've been scratching your head at Lakebase's "branching" feature wondering "am I working with a database or GitHub?"—you're not alone. Let me break down what's actually happening here, becaus...
@AbhaySingh ,
This was a fun read — and a great way to spark discussion about what “Git inside my database” really means in practice.
From what I’m seeing in the product world, Databricks isn’t literally putting Git inside the storage engine of your...
Introduction. Cloud-native data platforms like Azure Databricks are powerful because they abstract away infrastructure so you can focus on data engineering, analytics, and ML workloads. However, there are situations where you may run into issues that r...
Databricks has added two new features to its UI. They are small but quite effective for developer productivity. 1. Paste images into notebooks: Copy images from your local file system and paste them into markdown cells in Databricks notebooks. https:/...
Introduction. Scaling data pipelines across an organization can be challenging, particularly when data sources, requirements, and transformation rules are always changing. A metadata table-driven framework using LakeFlow Declarative (formerly DLT) enab...
Can you please share details on how this can be implemented, using a sample use case in a step-by-step process? Also, please share the Python code that needs to be written in each layer (bronze/silver/gold).
Databricks + Claude Code. This guide walks through a practical, end-to-end setup: installing Claude Code, wiring it to Anthropic models served from Databricks, and configuring authentication so everything “just works” from your terminal and editor. Yo...
Retirement is planned for Azure in Oct 2026; it was completed in other clouds in Oct 2025. Data residing in the Hive Metastore is opaque, suffers from low governance, and is siloed in legacy technical constructs. The Hive Metastore (HMS) was a technology revol...
Databricks provides built-in AI functions that can be used directly in SQL or notebooks, without managing models or infrastructure. Example: SELECT ticket_id, ai_generate( 'Summarize this support ticket:\n{{text}}', 'databricks-dbrx-instruct', descript...
With AI/BI Dashboards, a best practice is for the creator/owner to 'Schedule' the Dashboard to rerun the underlying datasets when changes have occurred. This ensures the Visualizations are rendered with the freshest data. But users still will questi...
@mark_ott That is a nice idea and quite useful. Quick question: how did you define 'stale' data in your case? What is the threshold at which your 'conditional equation' color-codes the date red? Did you somehow link that to the refresh schedule?
One thing becomes very clear when you spend time in the Databricks community: AI is no longer an experiment. It is already part of how real teams build, ship, and operate data systems at scale. For a long time, many organizations treated data engineer...
I just earned my Databricks Certified Generative AI Engineer Associate Certification, and in this post, I’m sharing the key tips, resources, and personal insights that helped me succeed. My certificate from Databricks. Navigation: Prelude, About the cert, R...
Hey @devipriya , thanks for sharing your notes on how you found success with the certification(s). Appreciate you taking the time to pass along what worked for you.
Cheers, Louis
When Databricks One was launched, the default behaviour of the system-managed users group was a major issue. Since every new user is automatically added to the users group, and that group traditionally came with "Workspace Access" entitlements, admins ha...
Here is a very short video on why the collect() method should be used with caution: it can overwhelm the cluster in some big data scenarios! For now it is only in Spanish, but I guess YouTube will do the trick and translate it into very correct English xDD
The Industrial Data Challenge. Manufacturing enterprises today operate with a fundamental paradox: they're drowning in data yet starving for insights. A typical plant generates terabytes of information daily across dozens of systems—from shop floor PLC...