
Is there some form of enablement required to use Delta Live Tables (DLT)?

tom_shaffner
New Contributor III

I'm trying to use Delta Live Tables, but even when I import the example notebooks I get an error saying `ModuleNotFoundError: No module named 'dlt'`. If I try to install it via pip, it attempts to install a deep learning framework of some sort.

I checked the requirements document and don't immediately see a runtime requirement; am I missing something? Is there something else I need to do to use this feature?

1 ACCEPTED SOLUTION

Aashita
Contributor III

Yes, you will get that error when you run the notebook directly.

Follow the steps below:

  • On the Databricks notebook left panel, select 'Jobs'
  • Select 'Delta Live Tables'
  • Select 'Create Pipeline'
  • Fill in the details: give the pipeline a name, and under Notebook Libraries point to the notebook that contains your dlt code (a minimal sketch of such a notebook follows the quickstart link below)
  • Click 'Start' in the top right corner
  • This starts the pipeline, populates the tables, and renders a graphical representation
  • NOTE: Make sure you attach a cluster to your notebook

https://docs.databricks.com/data-engineering/delta-live-tables/delta-live-tables-quickstart.html
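For reference, here is a minimal sketch of what such a notebook might contain, loosely based on the quickstart; the table names, comment text, source path, and column name below are placeholders, not from this thread:

    import dlt
    from pyspark.sql.functions import col

    # Each @dlt.table function defines one table; the pipeline discovers these
    # when it runs the notebook. Point the load path at your own data.
    # (In a Databricks notebook, `spark` is predefined.)
    @dlt.table(comment="Hypothetical example: raw events loaded from storage")
    def events_raw():
        return spark.read.format("json").load("/path/to/your/source/data")

    # Reading from another live table with dlt.read() links the two tables
    # in the pipeline graph that the UI renders.
    @dlt.table(comment="Hypothetical example: raw events with nulls filtered out")
    def events_clean():
        return dlt.read("events_raw").where(col("id").isNotNull())

When the pipeline starts, it runs the notebook, materializes both tables, and draws events_raw -> events_clean in the graph view.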


6 REPLIES

Thanks a lot for sharing this great example.

mangeldfz
New Contributor III

This error is so annoying... Is it going to be fixed or is there any workaround to avoid it?

Well, you are not supposed to run the notebook directly. You just define your Delta Live Tables in the notebook and attach a cluster. Once you have done that, go to Jobs and start the pipeline. The pipeline gathers the table definitions from the notebook, initializes them, sets up the tables, and renders the graph. Delta Live Tables is an orchestration feature.
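If you would rather trigger the pipeline programmatically than from the Jobs UI, a sketch like the following should work against the Delta Live Tables Pipelines REST API; the workspace URL, token, and pipeline ID are placeholders you must supply:

    import requests

    # Placeholders -- substitute your workspace URL, a personal access token,
    # and the ID of the pipeline you created in the UI.
    WORKSPACE = "https://<your-workspace>.cloud.databricks.com"
    TOKEN = "<personal-access-token>"
    PIPELINE_ID = "<pipeline-id>"

    # Start an update (a run) of the existing pipeline.
    resp = requests.post(
        f"{WORKSPACE}/api/2.0/pipelines/{PIPELINE_ID}/updates",
        headers={"Authorization": f"Bearer {TOKEN}"},
    )
    resp.raise_for_status()
    print(resp.json())  # Includes the update_id of the run that was started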

tom_shaffner
New Contributor III

Got it. That helps, thanks.

That could maybe be clearer in the documentation; it wasn't immediately obvious to me that this can't be run in a normal notebook environment. From the documentation it sounded like I could develop that way and then set up the DLT environment only for actual use.

Insight6
New Contributor II

Here's the solution I came up with... Replace `import dlt` at the top of your first cell with the following:

    try:
        import dlt  # When run in a pipeline, this module exists (there is no way to import it here)
    except ImportError:
        class dlt:  # Mock the dlt module so the rest of the notebook can be syntax-checked in the editor
            @staticmethod
            def table(comment=None, **options):  # Mock the @dlt.table decorator factory
                def _(f):
                    return f  # Return the function unchanged so later references still resolve
                return _

Further mocking may be required depending on how many members of the dlt module you use, but you get the gist (see the extended sketch below).

You can "catch" the import error and mock out a dlt class sufficiently that the rest of your code can be syntax-checked. This slightly improves the developer experience until you get a chance to actually run the notebook in a pipeline.

As many have noted, the special `dlt` library isn't available when running your Python code from the Databricks notebook editor, only when running it from a pipeline, which means you lose the ability to easily check your code's syntax before attempting to run it.

You also can't `%pip install` this library, because it isn't a public package; the `dlt` package that is out there has nothing to do with Databricks.
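As an illustration of the "further mocking" point, a slightly fuller mock might stub out the other decorators and readers a notebook uses. Which members you need depends entirely on your own code, so treat this as a sketch; `dlt.view` and `dlt.read` are real members of the pipeline-time module, but the stubs here are just syntax placeholders:

    try:
        import dlt
    except ImportError:
        class dlt:  # Extended mock: stub only the members this hypothetical notebook uses
            @staticmethod
            def table(comment=None, **options):  # Stub for the @dlt.table decorator factory
                def _(f):
                    return f
                return _

            view = table  # @dlt.view has the same decorator-factory shape

            @staticmethod
            def read(name):  # Stub for dlt.read("table_name"); returns nothing useful in the editor
                return None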
