Technical Blog

You should know these 4 things about setting up Unity Catalog!

Over the past year we’ve seen huge adoption of Unity Catalog, and we've noticed certain considerations that are key to all successful deployments. This blog focuses on how to set up a key artifact in ...

Databricks SSO with Auth0 using SAML 2.0

With the recent changes to Databricks login in order to increase our customers' security posture, some customers might be scrambling to set up SSO on their Databricks account and workspaces. Many of t...

Foundation Models API Prompting Guide 1: Lifecycle of a Prompt

This is the first part of a guide on writing prompts for models accessible via the Databricks Foundation Model API, such as DBRX and Llama 3. Prompting, whether in the context of interacting with a ch...

Why DBSQL is Best for BI Workloads - Part 5: Query Optimization with Primary Key Constraints

Authors: Andrey Mirskiy (@AndreyMirskiy) and Marco Scagliola (@MarcoScagliola). Introduction; Query optimization using primary key constraints - Overview; Demo Scenario; Preparation; Executing Sample Query wit...

MLOps Gym - Beginners Guide to Monitoring

Eventually, all software systems will encounter failures, and AI systems are not exempt from this reality. Failures in traditional software systems can stem from various sources, including infrastruct...

Enhancing the new PySpark Custom Data Sources Streaming API: Adding Progress Tracking Capability

A new PySpark Custom Data Sources API, introduced at DAIS 2024, simplifies integration with custom data sources in Apache Spark. Imagine seamlessly streaming incremental data from any API ...

Multi-table operations made simple with DiscoverX

In the ever-evolving landscape of data science and engineering, the ability to efficiently manage and manipulate data across multiple tables and databases is paramou...

Queries for Cost Attribution using System Tables

Organizations have expressed the need to see trends across their Databricks Accounts and drill down into Workspaces, SKUs, tags, and users. System Tables provide this visibility with little to no setu...

How not to build a demo

When making videos about newly announced features, part of the process is researching what the feature is, working out how best to demo it, and then making the demo itself. But here’s the twist: by definition,...

MLOps Gym - IDEs vs. Notebooks for Machine Learning Development

In machine learning, integrated development environments (IDEs) and notebooks play crucial roles in the development and execution of machine learning models. IDEs provide developers with comprehensive...
