The Lakeflow Declarative Pipelines framework makes it easy to build cost-effective streaming and batch ETL workflows using a simple and declarative syntax: you define the transformations for your data, and the platform will automatically manage typical data engineering challenges like task orchestration, scaling, monitoring, data quality, and error handling.
This blog post explores how to automate the bulk conversion of pipelines to the serverless tier.
Pipelines can be configured with one of three classic or two serverless editions, which provide different built-in capabilities.
The clearest advantage of the serverless tiers is that you do not need to dedicate time and effort to deciding which virtual machine type best suits the workload, tuning autoscaling properties to handle spiky loads, or tackling out-of-memory issues. The Databricks platform takes care of all these common data engineering and infrastructure challenges by leveraging telemetry and AI.
The serverless tier also provides some unique differentiators. If you want to deep-dive into the recent optimizations of serverless Lakeflow Declarative Pipelines, take a look at this Databricks blog.
The adoption of Databricks’ serverless architecture requires some account-level configuration to satisfy common enterprise security best practices (Unity Catalog, Network Connectivity Configuration, Serverless Egress Control). These configurations are typically managed centrally by a core infrastructure or data team, so let’s assume you have the environment already prepared to use the Databricks serverless infrastructure.
Databricks offers several features to help you monitor the cost of serverless compute (doc).
Tagging your resources with proper identifiers is fundamental for cost management and cost attribution. The serverless tier leverages a dedicated tagging mechanism called Budget Policies, which applies tags to any serverless compute activity performed by an identity assigned to the policy, similar to how classic resource tagging works. We are going to integrate Budget Policies into the migration process.
The serverless tier for Lakeflow Declarative Pipelines is available in two different flavors:
- Performance optimized (the default): compute is tuned for fast startup and low-latency execution.
- Standard: a cost-optimized mode that trades some startup latency for a lower cost.
The configuration of serverless mode is specified in the Workflow that schedules the pipeline (doc); thus, the pipelines converted using the utility will use either the default value or the configuration explicitly set in the associated Workflow.
At the moment, the serverless Standard tier can be enabled from the Preview Portal.
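For illustration, here is a minimal sketch of setting the mode on the scheduling job with the Python databricks-sdk. The performance_target field and its values are an assumption based on the serverless workflows preview and may differ across SDK versions:

from databricks.sdk import WorkspaceClient
from databricks.sdk.service import jobs

w = WorkspaceClient()

# Assumption: serverless mode is controlled by the job-level performance_target
# setting (PERFORMANCE_OPTIMIZED vs. STANDARD); naming may vary by SDK version.
w.jobs.update(
    job_id=123456,  # hypothetical ID of the Workflow that schedules the pipeline
    new_settings=jobs.JobSettings(
        performance_target=jobs.PerformanceTarget.STANDARD,
    ),
)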
The Lakeflow Declarative Pipelines serverless converter is an accelerator that speeds up the migration of existing pipelines from classic to serverless compute, minimizing manual conversion effort.
The converter provides the following capabilities:
- Interactive selection and in-place conversion of existing pipelines from classic to serverless compute
- Automatic backup of the original pipeline configurations
- Creation or assignment of Budget Policies that preserve the cost-attribution tags of the classic compute
- Rollback of converted pipelines to their original classic configuration
Start by cloning the GitHub project, move to the project folder serverless_converter, and install the utility in your environment. The Python databricks-sdk will be installed as a dependency in your preferred Python environment.
git clone https://github.com/databricks-solutions/databricks-blogposts
cd databricks-blogposts/2025-07-fast-track-to-serverless
pip install .
You can easily interact with the converter through the command line interface.
The converter authenticates using an identity recognized by your Databricks account. You can set the required environment variables or provide the values through the command line to authenticate as the selected identity.
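For example, here is a minimal sketch using the Python databricks-sdk, whose unified authentication reads DATABRICKS_HOST and DATABRICKS_TOKEN (or DATABRICKS_CLIENT_ID / DATABRICKS_CLIENT_SECRET for a service principal) from the environment:

from databricks.sdk import WorkspaceClient

# Unified auth resolves credentials from the environment; values can also be
# passed explicitly, e.g. WorkspaceClient(host=..., token=...).
w = WorkspaceClient()
print(w.current_user.me().user_name)  # confirms which identity is in use

Once authenticated, you can launch the conversion: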
serverless-converter convert [--backup-file FILE_PATH] [--budget-policy-id POLICY_ID] [--skip-budget-policy]
The convert command migrates all selected pipelines in place while backing up their current configuration.
The converter returns a list of all the workspace pipelines and asks which ones you want to convert to serverless. If the converter does not list the pipeline you are looking for, ensure that the principal has the correct permissions (Step 2).
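Under the hood, enumerating the candidate pipelines maps to a single call in the Python databricks-sdk; a minimal sketch:

from databricks.sdk import WorkspaceClient

w = WorkspaceClient()

# List every pipeline visible to the authenticated identity in the workspace.
for p in w.pipelines.list_pipelines():
    print(p.pipeline_id, p.name)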
The user is then prompted for their Budget Policy preferences.
The pipeline owner must be granted at least User permission on the selected Budget Policy to use it for the pipeline execution (doc).
If an existing policy is chosen, the user must input its budget policy ID; all selected pipelines are then converted in place to the serverless tier.
Alternatively, the converter can create a budget policy for each pipeline, replicating the tags set on the classic compute. The user or service principal that owns the pipeline is granted the appropriate permissions to attach the new budget policy.
Let’s explore the “Application_01” pipeline as an example. The existing pipeline has several tags associated with it to track the costs of its executions.
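For illustration, assume hypothetical tags such as:

cost-center: FIN-1234
team: data-platform
application: application-01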
These tags are replicated in a new budget policy with the same name as the pipeline.
The pipelines are then converted to serverless and associated with their respective budget policies.
As you can see, the tags are preserved, while the compute configuration has been updated.
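For reference, here is a minimal sketch of what such an in-place conversion looks like with the Python databricks-sdk, assuming the pipeline spec accepts a budget_policy_id field (the actual converter carries over the full original spec):

from databricks.sdk import WorkspaceClient

w = WorkspaceClient()
pipeline_id = "..."  # hypothetical pipeline ID
spec = w.pipelines.get(pipeline_id=pipeline_id).spec

# Re-submit the existing spec with serverless enabled; classic cluster settings
# are intentionally dropped, since serverless manages compute automatically.
w.pipelines.update(
    pipeline_id=pipeline_id,
    name=spec.name,
    catalog=spec.catalog,
    target=spec.target,
    libraries=spec.libraries,
    serverless=True,               # switch the pipeline to the serverless tier
    budget_policy_id="POLICY_ID",  # assumption: attach the associated budget policy
)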
You can also choose to convert the pipelines without associating any policy by providing the --skip-budget-policy flag when executing the convert command.
Databricks System Tables (doc) provide a comprehensive view of pipeline executions, allowing users to easily compare different runs. With the unified monitoring offered by the Databricks Intelligence Platform, you can easily quantify and evaluate the benefits of the conversion to serverless.
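For example, a notebook snippet along these lines compares daily DBU consumption for a given pipeline before and after the conversion (the query follows the system.billing.usage schema; replace the pipeline ID with your own):

# Classic and serverless executions appear under different sku_name values,
# so grouping by SKU makes the before/after comparison straightforward.
df = spark.sql("""
    SELECT usage_date, sku_name, SUM(usage_quantity) AS dbus
    FROM system.billing.usage
    WHERE usage_metadata.dlt_pipeline_id = '<pipeline-id>'
    GROUP BY usage_date, sku_name
    ORDER BY usage_date
""")
display(df)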
During the conversion process, a backup file containing all pipeline configurations is created. By using the rollback command, you can load the original configuration and revert the selected pipelines to the previous version.
To prevent issues with cross-dependencies, the rollback process does not delete the Budget Policies; it simply detaches them from the related pipelines once they are restored to the classic tier.
serverless-converter rollback --backup-file FILE_PATH
Transitioning your Lakeflow Declarative Pipelines to the serverless infrastructure can greatly enhance your data processing capabilities by improving performance, reducing latency, and lowering operational costs. Automating the bulk conversion removes significant manual effort, reduces human error, streamlines pipeline management, and accelerates your promotion to serverless.