Data Engineering
Migrating Lakeflow Declarative Pipelines from the legacy publishing mode is GA
Lakeflow Declarative Pipelines has a legacy publishing mode that only allows publishing to a single catalog and schema. The default publishing mode enables publishing to multiple catalogs and schemas. 📖 Documentation
Selectively and atomically replace data with INSERT REPLACE USING and INSERT REPLACE ON is GA
INSERT REPLACE USING replaces rows when the USING columns compare equal under equality. INSERT REPLACE ON replaces rows when they match a user-defined condition.
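A sketch of both forms, with illustrative table and column names (check the documentation for the exact syntax):

```sql
-- Replace target rows whose (region, sale_date) values match an incoming row
INSERT INTO sales REPLACE USING (region, sale_date)
SELECT * FROM sales_updates;

-- Replace target rows that satisfy a user-defined match condition
INSERT INTO sales REPLACE ON sales.region = sales_updates.region
SELECT * FROM sales_updates;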
Microsoft SQL Server and ServiceNow connectors are GA
Salesforce Data Cloud File Sharing connector is GA
New table property for Delta Lake compression
You can explicitly set the compression codec for a Delta table using the delta.parquet.compression.codec table property. 📖 Documentation
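For example, to switch a table's Parquet compression to ZSTD (the table name is illustrative):

```sql
ALTER TABLE main.default.events
SET TBLPROPERTIES ('delta.parquet.compression.codec' = 'zstd');
```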
Create external Delta tables from third-party clients
You can now create Unity Catalog external tables backed by Delta Lake from external clients and systems, such as Apache Spark. 📖 Documentation
Lakeflow Declarative Pipeline improvements
You can change the identity that a pipeline uses to run updates and the owner of tables published by the pipeline. This feature allows you to set a service principal as the run-as identity, which is safer and more reliable than using user accounts for automated workloads. 📖 Documentation
You can now easily create an ETL pipeline in a bundle in the workspace using the new Lakeflow Declarative Pipelines template project.
You can use automatic liquid clustering with CLUSTER BY AUTO, and Databricks intelligently chooses clustering keys to optimize query performance.
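For example (table names are illustrative):

```sql
-- New table with automatically selected clustering keys
CREATE TABLE main.default.sales CLUSTER BY AUTO
AS SELECT * FROM main.default.raw_sales;

-- Enable automatic liquid clustering on an existing table
ALTER TABLE main.default.sales CLUSTER BY AUTO;
```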
⚡Lakebase
OLTP Database tab renamed to Lakebase Postgres
Lakebase synced tables support syncing Apache Iceberg and foreign tables
You can create synced tables in Snapshot sync mode from Iceberg tables or foreign tables. 📖 Documentation
Data type mapping
For new synced tables: TIMESTAMP types in source tables are mapped to TIMESTAMP WITH TIME ZONE in synced tables.
Budget policy is supported by Lakebase
You can tag a database instance and a synced table with a budget policy to attribute billing usage to specific policies. Additionally, custom tags can be added to a database instance for more granular attribution of compute usage to teams, projects, or cost centers.
Lakebase is enabled by default
The Lakebase: Managed Postgres OLTP Database preview is now enabled by default.
🪄Serverless
Serverless compute for notebooks, workflows, and Lakeflow Declarative Pipelines is available in the Asia Pacific (Jakarta) region (ap-southeast-3).
Base environments are custom environment specifications for serverless notebooks that define a serverless environment version and a set of dependencies. 📖 Documentation
Serverless GPU compute supports hyperparameter sweeps, multi-node workloads, and scheduled jobs. 📖 Documentation
New features are available on Serverless Compute:
Databricks Connect upgraded to 17.0
Scalar Python UDFs support service credentials
PySpark and Spark Connect support the df.mergeInto API
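A minimal sketch of the DataFrame merge API (the table names, merge condition, and surrounding Spark session are assumptions; see the documentation for exact semantics):

```python
# Sketch only: assumes an active Spark session (Databricks Connect 17.0+)
# and existing tables main.sales.target / main.sales.updates.
from pyspark.sql.functions import expr

updates = spark.read.table("main.sales.updates")

(updates
    .mergeInto("main.sales.target", expr("target.id = source.id"))
    .whenMatched().updateAll()       # update rows that match the condition
    .whenNotMatched().insertAll()    # insert rows with no match in the target
    .merge())                        # execute the merge
```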
🖥️Platform
Notebooks Improvements
Edit Mode in Assistant does multi-cell code refactoring and more. 📖Documentation.
Use the cell execution minimap to track your notebook’s progress at a glance. The minimap appears in the right margin and shows each cell’s execution state. Hover to see cell details, or click to jump directly to a cell.
Notebook autocomplete supports enhanced suggestions for complex data types including structs, maps, and arrays in SQL cells.
Lakeflow job improvements
Jobs that are set to run in continuous mode have the option to retry individual tasks on task failure. 📖Documentation
Power BI Databricks connector supports M2M OAuth
You can authenticate into Power BI Desktop using M2M OAuth. Databricks recommends switching to the new client credentials authentication option. 📖Documentation
Account SCIM 2.0 updates
Databricks has updated the Account SCIM API for identity management as follows:
Calling GET with filter params filter=displayName eq value_without_quotes results in a syntax error. To prevent this error, wrap the value in quotation marks.
Calling GET /api/2.0/accounts/{account_id}/scim/v2/Groups no longer returns members. Instead, call the get group details endpoint for each group to get membership information.
Calling PATCH /api/2.0/accounts/{account_id}/scim/v2/Groups/{id} returns a 204 response instead of a 200 response.
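For the filter change, a minimal sketch of building the quoted, URL-encoded query string in Python (the group name is illustrative):

```python
from urllib.parse import urlencode

# The comparison value in the SCIM filter must be wrapped in double quotes.
display_name = "Data Engineers"
query = urlencode({"filter": f'displayName eq "{display_name}"'})

# Append to GET /api/2.0/accounts/{account_id}/scim/v2/Groups
print(query)  # filter=displayName+eq+%22Data+Engineers%22
```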
OAuth token federation is GA
Databricks assistant improvements
You can chat with Databricks Assistant on some compute pages. Use the Assistant chat panel to help you create new compute resources, pools, and policies.
You can tailor how Databricks Assistant responds by adding custom user instructions. Guide the Assistant with preferences, coding conventions, and response guidelines.
Disable legacy features for new workspaces
A new account console setting allows account admins to disable certain legacy features on new workspaces created in their account. 📖Documentation
Serverless Workspaces are available
🤖GenAI & ML
AI Playground is GA
OpenAI GPT OSS models are available on Mosaic AI Model Serving
Mosaic AI Model Serving supports OpenAI's GPT OSS 120B and GPT OSS 20B as Databricks-hosted foundation models. 📖Documentation
Batch inference with GPT OSS models
OpenAI GPT OSS 120B and GPT OSS 20B are optimized for AI Functions, which means you can perform batch inference using these models and AI Functions like ai_query()
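For example, a batch summarization pass over a table (the endpoint name and table schema are assumptions):

```sql
SELECT
  review_id,
  ai_query(
    'databricks-gpt-oss-20b',
    CONCAT('Summarize this review in one sentence: ', review_text)
  ) AS summary
FROM main.reviews.raw_reviews;
```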
External MCP servers are in Beta
You can connect Databricks to external MCP servers. 📖 Documentation
Mosaic AI Vector Search reranker is available
Mosaic AI Vector Search offers reranking to help improve retrieval quality. 📖 Documentation
Token-based rate limits are available on AI Gateway
You can configure token-based rate limits on your model serving endpoints. 📖 Documentation
OpenAI GPT OSS model releases
The Databricks-hosted foundation models OpenAI GPT OSS 120B and GPT OSS 20B support function and tool calling and provisioned throughput. 📖 Documentation
📝AIBI Genie
SQL expression validation for join relationships
You can define join relationships locally within a Genie space's knowledge store. This is useful when authors lack permissions to define primary and foreign keys on upstream tables, or when the join relationship has specific requirements, such as one-to-many or complex joins.
File uploads are accessible only to the users who uploaded them
Value dictionaries select the most frequent 1,024 values from the first 100,000 rows, instead of the first 1,024 values encountered.
📊AIBI Dashboard
Dashboard external embedding is in Public Preview, and new custom calculation functions are available
You can define up to 200 custom calculations per dashboard.
🛡️Governance
Governed tags are in Public Preview
You can create governed tags to enforce consistent tagging across data assets such as catalogs, schemas, and tables. Admins define the allowed keys and values and control which users and groups can assign them to objects. This helps standardize metadata for data classification, cost tracking, access control, and automation. 📖 Documentation
Single-node compute on standard access mode is GA
This configuration allows multiple users to share a single-node compute resource with full user isolation. Single-node compute is useful for small jobs or non-distributed workloads.
Column masks retained when replacing a table
If a column in the new table matches a column name from the original table, its existing column mask is retained, even if no mask is specified. This change prevents accidental removal of column-level security policies during table replacement.
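A sketch of the behavior (the mask function and table are illustrative):

```sql
-- Existing table with a mask already applied to ssn:
--   ALTER TABLE users ALTER COLUMN ssn SET MASK mask_ssn;

-- Replacing the table keeps the mask, because the column name matches
CREATE OR REPLACE TABLE users (
  id  BIGINT,
  ssn STRING  -- mask_ssn is retained even though no mask is specified here
);
```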
Access requests in Unity Catalog
You can enable self-service access requests in Unity Catalog by configuring access request destinations on securable objects.
Users can request access to Unity Catalog objects that they discover. These requests are sent to configured destinations, such as email, Slack, or Microsoft Teams channels, or they can be redirected to an internal access management system.
Path credential vending
You can use path credential vending to grant short-lived credentials for external locations in your Unity Catalog metastore. 📖 Documentation
🔍 Data Warehousing
Default warehouse setting is available in Beta
Support for timestamp without time zone syntax
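For example, the ANSI spelling can be used in DDL (assuming it maps to the TIMESTAMP_NTZ type; table name is illustrative):

```sql
CREATE TABLE main.default.events_local (
  id BIGINT,
  created_at TIMESTAMP WITHOUT TIME ZONE  -- equivalent to TIMESTAMP_NTZ
);
```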
Support for schema and catalog level default collation
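For example, a case-insensitive default collation at the schema level (a sketch; the schema name is illustrative and the exact syntax is per the documentation):

```sql
-- String columns created in this schema inherit UTF8_LCASE unless overridden
CREATE SCHEMA main.reporting DEFAULT COLLATION UTF8_LCASE;
```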
Expanded spatial SQL expressions and GEOMETRY and GEOGRAPHY data types