Explore in-depth articles, tutorials, and insights on data analytics and machine learning in the Databricks Technical Blog. Stay updated on industry trends, best practices, and advanced techniques.
When running distributed training or batch inference on multi-node GPU clusters with Spark, the GPUs on the Driver node often remain underutilized, wasting GPU resources. The...
Introduction
Enterprise Data Warehouse (EDW) migration use cases are always complex. Data architects and data engineers spend most of their time analyzing the source data, identifying t...
In the halcyon days of data science’s youth, version control was an oft-overlooked aspect of the work of data science teams, a “nice-to-have” that was perhaps the domain of hobbyists and enthusiasts. ...
Imagine you are a data engineer using Databricks Auto Loader to ingest data from cloud storage, when you suddenly realize that recently ingested data has a problem and you’d like to simply dele...
This post is written by Pascal Vogel, Solutions Architect, and Kiryl Halozhyn, Senior Solutions Architect.
The Databricks Data Intelligence Platform allows your entire organization to use data and AI....
CI/CD with Databricks Asset Bundles, Workflows and Azure DevOps
In this article you will learn how to set up Databricks Workflows with CI/CD. There are two essential components needed for a comple...
Introduction
Data and AI media will try to convince you of two terribly misguided ideas about AI’s impact on analytics. First of all, there’s the idea that prompt engineers who know just the right phr...
Databricks Unity Catalog revolutionizes data governance for the Data Intelligence Platform, providing enterprises with AI-powered tools to seamlessly govern their structured and unstructured data, ML...
This blog was posted to my feed, and it split readers into two groups. Those who thought ‘meh’ and moved on with their lives, and then the serious minority who thought this was the most super duper bet...
LLMs on Databricks are now available to call via LiteLLM. LiteLLM is a library that provides a Python client and an OpenAI-compatible proxy for accessing 100+ LLMs with the same input/output formats, ...