11-22-2024 06:48 AM - edited 11-22-2024 06:57 AM
Hi!
Any time I try to import "dlt" in a notebook session to develop pipelines, I get an error message saying DLT is not supported on Spark Connect clusters. These are very generic clusters; I've tried runtimes 14, 15, and the latest 16, using shared clusters as recommended.
How are we supposed to use DLT?
11-22-2024 07:05 AM
Hey Oakhill, are you trying to create a DLT pipeline outside of the Databricks environment? Provide more details of your setup. Thanks, Louis.
11-22-2024 07:08 AM
11-22-2024 07:23 AM
Delta Live Tables does not support shared clusters. You need to configure the cluster (a jobs cluster) via the Delta Live Tables UI (under Workflows). You would still write your notebook with DLT using Python, but the pipeline itself has to be configured via the UI and run from the UI.
11-22-2024 11:38 AM - edited 11-22-2024 11:43 AM
I am aware that I have to specify a job cluster/pipeline when deploying. But I am asking how to actually develop on a newer DBR. Back in 12 and 13 we would share a cheap development cluster and then push to prod with autoscale and larger clusters.
I just booted up a cluster on DBR 13 and it worked. Why is that? It's an old version and doesn't support Liquid Clustering, for example.
EDIT: OK, so I had to create a user-isolated cluster on DBR 15, unlike the shared one on DBR 13. It's unfortunate that we have to make individual clusters for DLT.
11-22-2024 07:31 AM
Hey Oakhill,
Are you trying to execute the notebook by clicking the Run button? If that is the case, it is not going to work. You need to write all the code first. After that, go to the sidebar menu, click Workflows, choose the Delta Live Tables tab, and click the blue button in the right corner to create a pipeline.
Then you need to set up your pipeline with information about your cluster, the destination, and the path of your notebook. After that, click Create and your pipeline will be created. Then click the Start button to start your DLT pipeline.
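For readers who want to see what the setup form collects, here is a rough sketch of the same information (cluster, destination, notebook path) written as a pipeline settings document in Python. The field names follow the pipeline settings JSON as commonly documented and should be checked against your workspace. The helper `build_pipeline_settings`, the pipeline name, the notebook path, and the schema name are all made up for illustration.

```python
import json


def build_pipeline_settings(name, notebook_path, target_schema,
                            num_workers=1, development=True):
    """Hypothetical helper: assemble a DLT pipeline settings dict.

    Mirrors the details entered in the UI: the cluster, the destination,
    and the path of the notebook that holds the DLT code.
    """
    return {
        "name": name,
        "development": development,   # development mode instead of production
        "continuous": False,          # triggered run, started manually
        "target": target_schema,      # destination schema for the tables
        "libraries": [{"notebook": {"path": notebook_path}}],
        "clusters": [{"label": "default", "num_workers": num_workers}],
    }


# Example values only; none of these come from the thread.
settings = build_pipeline_settings(
    name="orders_dev",
    notebook_path="/Users/someone@example.com/dlt_orders",
    target_schema="dev_orders",
)
print(json.dumps(settings, indent=2))
```

Note that no runtime version appears in these settings, only the cluster size.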
11-22-2024 11:38 AM
No, this is when I try to develop on a DBR above 13. I can't even import the dlt package when developing.
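For anyone hitting the same import error, one possible workaround (not something confirmed in this thread) is to guard the import and keep the transformation logic in plain functions, so the notebook still loads on an interactive cluster. This is a minimal sketch under assumptions: `pipeline_table`, `keep_positive_amounts`, `clean_orders`, and the table name `raw_orders` are hypothetical, and the broad `except` is used because the exact exception type is not stated above.

```python
# Sketch: only touch the `dlt` module when it is actually usable.
try:
    import dlt  # expected to work only inside a running DLT pipeline
    IN_PIPELINE = hasattr(dlt, "table")
except Exception:  # broad on purpose: the exception type is an unknown here
    dlt = None
    IN_PIPELINE = False


def pipeline_table(func):
    """Hypothetical helper: register `func` as a DLT table when possible.

    Outside a pipeline the function is returned unchanged, so it can be
    called and inspected from an ordinary notebook cell.
    """
    if IN_PIPELINE:
        return dlt.table(name=func.__name__)(func)
    return func


def keep_positive_amounts(df):
    """Plain transformation: needs only an object with a DataFrame-style filter()."""
    return df.filter("amount > 0")


@pipeline_table
def clean_orders():
    # `spark` is the session a Databricks notebook provides;
    # "raw_orders" is a made-up source table name.
    return keep_positive_amounts(spark.read.table("raw_orders"))
```

With this layout, `keep_positive_amounts` can be tried against a sample DataFrame on any cluster, while the decorated `clean_orders` only becomes a DLT table when the notebook runs as part of a pipeline.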
a month ago
Oakhill, when building a DLT pipeline you are not prompted for a DBR version when defining the cluster. DLT handles the infrastructure rollout for you. Can you please share a screenshot of the error you are receiving? I want to see where you are in the platform when you get the error.
3 weeks ago
Hi BigRoux, thanks for the patience.
When developing, I use a standard notebook and attach it to a running cluster. This is not when I _deploy_ the DLT.
This is when the error occurs.
Surely we are not supposed to develop when running the actual pipeline using the validate button? Do you normally develop on an active pipeline?
I have added a screenshot.
11-22-2024 07:39 AM
Oakhill, we do provide free onboard training. You might be interested in the "Get Started with Data Engineering on Databricks" session. You can register here: https://www.databricks.com/training/catalog. When you are searching the catalog of training be sure to filter on "Type" for "Get Started." Cheers, Louis.