Proposed by Databricks in 2020, the Lakehouse architecture has been increasingly embraced by the industry in recent years. Databricks has since applied Generative AI and evolved the Databricks Lakeh...
In this blog, we will look at the migration from AWS Glue Data Catalog to Unity Catalog. We cover how to plan this migration step by step, emphasizing meticulous planning, phased migrat...
Almost anyone who uses Spark or Databricks is aware of the Spark UI, and we all know that it’s a super powerful tool in the right hands. It can reveal what’s going wrong and any inefficiencies in you...
Introduction
Cost optimisation remains a pivotal challenge for customers processing large volumes of data and training machine learning models at scale in the cloud. Spot instances have re...
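As a hedged illustration of how spot capacity is typically requested, a Databricks cluster definition on AWS can set `aws_attributes` to prefer spot instances with an on-demand fallback for the driver. The field names below come from the Databricks Clusters API; the cluster name, node type, and worker counts are placeholder values, not a recommendation:

```json
{
  "cluster_name": "spot-etl-example",
  "node_type_id": "i3.xlarge",
  "autoscale": { "min_workers": 2, "max_workers": 8 },
  "aws_attributes": {
    "availability": "SPOT_WITH_FALLBACK",
    "first_on_demand": 1,
    "spot_bid_price_percent": 100,
    "zone_id": "auto"
  }
}
```

Here `first_on_demand: 1` keeps the driver on an on-demand instance so a spot reclamation cannot take down the whole cluster, while the workers bid at up to 100% of the on-demand price.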
When setting up compute, there are many options and knobs to tweak and tune, and it can quickly become overwhelming. To help you configure your clusters optimally, we have broken dow...
Authors: Andrey Mirskiy (@AndreyMirskiy) and Marco Scagliola (@MarcoScagliola)
Welcome to the fourth part of our blog series on “Why Databricks SQL Serverless is the best fit for BI workloads”.
I...
Databricks Model Serving provides a scalable, low-latency hosting service for AI models. It supports models ranging from small custom models to best-in-class large language models (LLMs). In this blog...
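To make the hosting model concrete, a served model is queried over REST at its `/serving-endpoints/{name}/invocations` path with a JSON body such as `dataframe_split`. The sketch below only builds the request; the workspace URL, endpoint name, feature names, and token are all hypothetical placeholders:

```python
import json


def build_invocation_request(workspace_url, endpoint_name, columns, rows, token):
    """Assemble the URL, headers, and JSON body for a Model Serving
    /invocations call. All argument values are caller-supplied placeholders."""
    url = f"{workspace_url}/serving-endpoints/{endpoint_name}/invocations"
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
    # dataframe_split mirrors pandas' "split" orientation: column names plus rows.
    body = json.dumps({"dataframe_split": {"columns": columns, "data": rows}})
    return url, headers, body


# Hypothetical workspace, endpoint, and features for illustration only.
url, headers, body = build_invocation_request(
    "https://example.cloud.databricks.com",
    "my-model",
    ["feature_a", "feature_b"],
    [[1.0, 2.0]],
    "dapi-...",  # personal access token placeholder
)
# The call itself would then be sent with an HTTP client,
# e.g. requests.post(url, headers=headers, data=body)
```

The same endpoint shape serves both small custom models and LLMs, which is what lets client code stay uniform as the model behind the endpoint changes.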
Unity Catalog (UC) is Databricks' unified governance solution for all data and AI assets on the Data Intelligence Platform. UC is central to implementing MLOps on Databricks as it is where all your as...
In the world of data science, there is often a need to optimize or migrate legacy code. In this blog post, we address a common technical challenge faced by many data scientists and engineers - making ...
Inuktitut, the language of the Inuit, has 50 words for snow and ice.
That’s, as they say, fake news, but the point is metaphorical:
When something is important to a people, their language finds...