Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Connect to Azure Data Lake Storage using Databricks Free Edition

TalessRocha
New Contributor II

Hello guys, I'm using Databricks Free Edition (serverless) and I am trying to connect to an Azure Data Lake Storage account.

The problem I'm having is that in the free edition we can't configure the cluster, so I tried to make the connection from a notebook using spark.conf.set, but this configuration is not enabled in the free edition... and in the Unity Catalog interface, only the options to add AWS and Cloudflare credentials appear. Is there any other way?

I have tried using dbutils.fs.mount as well, but it is not available on the free edition...
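
For reference, the spark.conf.set approach in question looks roughly like this (the standard ADLS Gen2 OAuth service-principal pattern; every value below is a placeholder):

```python
# Standard ADLS Gen2 OAuth setup (all values are placeholders);
# on Free Edition serverless, setting these via spark.conf.set is blocked.
acct = "<storage-account>"
spark.conf.set(f"fs.azure.account.auth.type.{acct}.dfs.core.windows.net", "OAuth")
spark.conf.set(
    f"fs.azure.account.oauth.provider.type.{acct}.dfs.core.windows.net",
    "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
)
spark.conf.set(f"fs.azure.account.oauth2.client.id.{acct}.dfs.core.windows.net", "<client-id>")
spark.conf.set(f"fs.azure.account.oauth2.client.secret.{acct}.dfs.core.windows.net", "<client-secret>")
spark.conf.set(
    f"fs.azure.account.oauth2.client.endpoint.{acct}.dfs.core.windows.net",
    "https://login.microsoftonline.com/<tenant-id>/oauth2/token",
)
```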


9 REPLIES

Khaja_Zaffer
Contributor III

Hello @TalessRocha,

In the Free Edition, DBFS is disabled. You should use Unity Catalog for that purpose anyway; DBFS is a deprecated pattern for interacting with storage.

So, to use a volume, perform the following steps:

Go to Catalogs -> click the workspace catalog -> click the default schema -> click the Create button.

Under the Create button you will have an option to create a volume. Pick a name and then create the volume.
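
Alternatively, the same volume can be created from a notebook cell (a sketch; the catalog, schema, and volume names are placeholders matching the UI steps above):

```python
# Create a managed Unity Catalog volume; equivalent to the UI steps above.
# "workspace.default.my_volume" is a placeholder three-level name.
spark.sql("CREATE VOLUME IF NOT EXISTS workspace.default.my_volume")
```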


Reference: szymon_dybczak.

If you did that, your new volume should appear in Unity Catalog under the default schema. Now you will have the option to upload a file to the volume via the UI.

And here's an example of how to read a CSV from a volume into a DataFrame:
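
Something like this (a sketch standing in for the original screenshot; the volume path, file name, and options are placeholders):

```python
# Read a CSV uploaded to the volume into a Spark DataFrame.
# The /Volumes/<catalog>/<schema>/<volume>/<file> path is a placeholder.
df = (
    spark.read
    .option("header", True)
    .option("inferSchema", True)
    .csv("/Volumes/workspace/default/my_volume/sales.csv")
)
display(df)
```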

 

 

BS_THE_ANALYST
Esteemed Contributor II
(Accepted solution)

@TalessRocha, if you were trying to connect to ADLS on your local machine using Python (for instance), you'd probably install the appropriate Python packages to authenticate and then retrieve the containers & blobs/files. I don't see why we can't employ the same logic with the Free Edition.

Here's the official documentation on doing that: https://learn.microsoft.com/en-us/azure/storage/blobs/data-lake-storage-directory-file-acl-python?ta...

Pair that with a YouTube video/tutorial and some AI assistance.

I'd try locally using Python & then try on the Free Edition. 

A word of warning: don't connect to confidential information and bring it into the Databricks Free Edition. You should read the terms of service as to why. It goes without saying: it's not your storage or compute under the hood, right?

The only potential blocker here, in my opinion, is being unable to pip install the appropriate libraries on the Free Edition. Maybe that's worth trying before getting stuck into the weeds.
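
For instance, something along these lines might work (an untested sketch of the local-Python pattern; the package names are the real Azure SDK ones, but the account, container, path, and credential values are all placeholders):

```python
# %pip install azure-storage-file-datalake azure-identity
from azure.identity import ClientSecretCredential
from azure.storage.filedatalake import DataLakeServiceClient

# Service-principal credentials -- placeholders, supply your own
# (ideally from a secrets mechanism rather than hard-coded).
credential = ClientSecretCredential(
    tenant_id="<tenant-id>",
    client_id="<client-id>",
    client_secret="<client-secret>",
)

# Connect to the ADLS Gen2 account and list the paths in a container.
service = DataLakeServiceClient(
    account_url="https://<storage-account>.dfs.core.windows.net",
    credential=credential,
)
fs = service.get_file_system_client("my-container")
for item in fs.get_paths():
    print(item.name)

# Download one file's bytes (the path is a placeholder).
data = fs.get_file_client("folder/data.csv").download_file().readall()
```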

Let me know how you end up resolving this, I'm interested. 

All the best,
BS

TalessRocha
New Contributor II

Thank you, I was able to do everything using the Azure Data Lake Storage and Azure Identity client libraries on the free edition!

BS_THE_ANALYST
Esteemed Contributor II

@Khaja_Zaffer I don't think that was what @TalessRocha was looking for. I think it's more about connecting to Azure blob storage from the free edition.

That's a great answer for how to import data into the free edition though!

All the best,
BS

sumitPanda
New Contributor II

I would always use the UC way. That is the standard for production workloads.

Sumit Panda

BS_THE_ANALYST
Esteemed Contributor II

@sumitPanda, is this supported in Databricks Free Edition, i.e. connecting to ADLS via UC? I think that's part of the constraints at play here.

Completely agree: in production, UC is the way. In fact, we'd normally not need to connect to ADLS directly if we're using Azure Databricks (and not using external tables). It does depend on the use case, of course.

All the best,
BS

Yes, that's the limitation. I checked through the documentation and it's not explicitly mentioned anywhere.

Here are the details on how to connect: Connect to Azure Data Lake Storage and Blob Storage - Azure ...
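
For completeness, once a UC storage credential and external location are configured (which, per this thread, isn't available for Azure on the Free Edition), reading becomes a direct path access. A sketch, with a placeholder URI:

```python
# Direct read through a Unity Catalog external location.
# The abfss URI below is a placeholder.
df = spark.read.parquet(
    "abfss://my-container@mystorageaccount.dfs.core.windows.net/path/to/data/"
)
df.show()
```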

Sumit Panda

Khaja_Zaffer
Contributor III

The above information on the DBFS-related stuff I took from szymon_dybczak (thank you).

Sure, next time I will read things more carefully before answering, @BS_THE_ANALYST.

BS_THE_ANALYST
Esteemed Contributor II

@TalessRocha thanks for getting back to us! 

Glad to hear you got it working, that's awesome. Best of luck with your projects.

All the best,
BS