Here's the difference between a View and a Table in the context of a Delta Live Tables pipeline. Views are similar to a temporary view in SQL and are an alias for some computation. A view allows you to break a complicated query into smaller or easier-to-understan...
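For illustration, a minimal sketch of that distinction in DLT Python (the source table raw_orders and the column names are made up):

import dlt
from pyspark.sql import functions as F

@dlt.view
def cleaned_orders():
    # a view is an alias for a computation; it is recomputed on demand, not persisted
    return spark.read.table("raw_orders").where(F.col("status").isNotNull())

@dlt.table
def daily_order_counts():
    # a table is materialized to storage as a Delta table
    return dlt.read("cleaned_orders").groupBy("order_date").count()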
I have two notebooks created for my Delta Live Table pipeline. The first is a utils notebook with functions I will be reusing for other pipelines. The second contains my actual creation of the delta live tables. I added both notebooks to the pipeline...
Hi Dave, you can solve this by putting your utils into a Python file and referencing that .py file in the DLT notebook. I provided a template for the Python file below:
STEP 1: #import functions
from pyspark.sql import SparkSession
import IPython
dbut...
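For context, a hedged guess at how a utils .py file like this might be completed; the IPython trick for grabbing dbutils and the helper function below are assumptions, not the original template:

from pyspark.sql import SparkSession, functions as F
import IPython

spark = SparkSession.builder.getOrCreate()
dbutils = IPython.get_ipython().user_ns["dbutils"]  # reuse the notebook session's dbutils inside a .py module

def add_ingest_metadata(df):
    # hypothetical reusable helper: stamp rows with a load timestamp
    return df.withColumn("ingested_at", F.current_timestamp())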
I need to execute a DLT pipeline from a Job, and I would like to know if there is any way of passing a parameter. I know you can have settings in the pipeline that you use in the DLT notebook, but it seems you can only assign values to them when crea...
This seems to be the key to this question: parameterize for dlt. My understanding is that you can add the parameter either in the DLT settings UI via Advanced Config / Add Configuration (key, value dialog), or via the corresponding pipeline set...
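As a sketch, a key added under Advanced Config can then be read in the DLT notebook with spark.conf.get (the key name and table below are made up):

import dlt
from pyspark.sql import functions as F

start_date = spark.conf.get("mypipeline.start_date", "2023-01-01")  # hypothetical key, with a fallback default

@dlt.table
def filtered_events():
    return spark.read.table("raw_events").where(F.col("event_date") >= start_date)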
The first time I run my Delta Live Tables pipeline after setup, I get this error on starting it:
org.apache.spark.sql.catalyst.parser.ParseException: Possibly unquoted identifier my-schema-name detected. Please con...
This still errors on internal Databricks Spark/Python code like deltaTable.history().
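The parser is complaining about a hyphenated identifier; Spark SQL requires such names to be back-quoted. A minimal illustration (the schema and table names are made up):

# hyphens are not legal in bare identifiers, so quote them with backticks
spark.sql("SELECT * FROM `my-schema-name`.`my-table`")
# or avoid hyphens entirely, e.g. my_schema_name, which needs no quoting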
Hi there, can you use %run or dbutils.notebook.run() in a Delta Live Tables (DLT) pipeline? When I try, I get the following error: "IllegalArgumentException: requirement failed: To enable notebook workflows, please upgrade your Databricks subscriptio...
Hi all. @Kaniz Fatma, thanks for your answer. I am on the Premium pricing tier in Azure. After digging around in the logs, it would seem that you cannot run magic commands in a Delta Live Tables pipeline. Therefore, you cannot use %run in a DLT pipeline - w...
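As a workaround sketch, shared code can live in a .py module that the DLT notebook imports instead of including it via %run (the path and helper name below are assumptions):

import sys
sys.path.append("/Workspace/Shared/my_pipeline")  # hypothetical location of the shared module
from utils import add_ingest_metadata  # a plain import replaces the %run include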
We are exploding a map type column into multiple columns based on the keys of the map column. Part of this process is to extract the keys of a map type column called json_map as illustrated in the snippet below. The code executes as expected when run...
Hi @Suteja Kanuri, thank you for your response and explanation. The code I have shown above is not the exact snippet we are using. Please find the exact snippet below. We are dynamically extracting the keys of the map and then using getItem() to mak...
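For illustration, a dynamic version of that pattern, assuming df holds the source DataFrame with the json_map column:

from pyspark.sql import functions as F

# collect the distinct map keys, then project each key as its own column via getItem()
keys = [r[0] for r in df.select(F.explode(F.map_keys("json_map"))).distinct().collect()]
df_wide = df.select("*", *[F.col("json_map").getItem(k).alias(k) for k in keys])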
Hello, I'm pretty new to Databricks in general and Delta Live Tables specifically. My problem statement is that I'd like loop through a set of files and run a notebook that loads the data into some Delta Live Tables. Additionally, I'd like to include...
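One common sketch for this is to generate tables in a loop inside the DLT notebook, using a factory function so each table captures its own file path (the paths and naming scheme are made up):

import dlt

paths = ["/mnt/raw/customers.csv", "/mnt/raw/orders.csv"]  # hypothetical source files

def define_table(path):
    name = path.split("/")[-1].replace(".csv", "")
    @dlt.table(name=name)
    def load():
        # DLT registers one table per loop iteration, each reading its own file
        return spark.read.format("csv").option("header", "true").load(path)

for p in paths:
    define_table(p)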
Failed to launch pipeline cluster 0802-171503-4m02lexd: The operation could not be performed on your account with the following error message: azure_error_code: OperationNotAllowed, azure_error_message: Operation could not be completed as it results ...
Yes, you can specify a "target" database as part of your DLT pipeline configuration to publish results to that database in the metastore. See https://docs.databricks.com/data-engineering/delta-live-tables/delta-live-tables-quickstart.html#publi...
DLT pipeline results are published to the "Storage Location" defined as part of configuring the pipeline. Ex: https://docs.databricks.com/_images/dlt-create-notebook-pipeline.png. If an explicit Storage Location is not specified, the pipeline results ...
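For reference, a hedged sketch of the pipeline-settings JSON these two answers refer to; the names and paths are made up, but "target", "storage", and "libraries" are the relevant fields:

{
  "name": "my-pipeline",
  "target": "analytics_db",
  "storage": "dbfs:/pipelines/my-pipeline",
  "libraries": [
    { "notebook": { "path": "/Repos/user/dlt_notebook" } }
  ]
}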