
dbutils.fs.ls versus pathlib.Path

chari
Contributor

Hello community members,

dbutils.fs.ls('/') exposes the distributed file system (DBFS) on the Databricks cluster. Similarly, the Python library pathlib can also expose files on the cluster, as below:

from pathlib import Path

mypath = Path('/')

# iterdir() yields one Path per entry, so use enumerate() to get a running number
for num, item in enumerate(mypath.iterdir()):

    print(num, item)

How do they differ, and what kinds of files does each let you work with?

Thanks

2 REPLIES

Wojciech_BUK
Valued Contributor III

I think it will be useful to look at this documentation to understand the different kinds of files and how you can interact with them:
https://learn.microsoft.com/en-us/azure/databricks/files/

There is not much more to say than that dbutils is "Databricks code" that lets you work with Databricks assets more easily than writing your own code.

When you run commands (not Spark) in Databricks, you basically execute them on the driver node, which is a Linux machine.
Whatever is mounted on this machine will be available to any code that can run on it.
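
To make the difference concrete, here is a rough sketch (not an official recipe). The pathlib part only sees the driver's local Linux filesystem. The `/dbfs` FUSE mount is an assumption that depends on your cluster configuration, and `list_dir` is just a helper name I made up. `dbutils` is predefined only inside a Databricks notebook, so that part is guarded.

```python
from pathlib import Path


def list_dir(path="/"):
    """Return (index, entry) pairs for a directory on the local (driver) filesystem."""
    return list(enumerate(sorted(Path(path).iterdir())))


# Plain Python: lists the root of the driver node's local filesystem.
for num, item in list_dir("/"):
    print(num, item)

# Assumption: on many clusters DBFS is also visible locally through a FUSE
# mount at /dbfs. If the mount is absent, this block is simply skipped.
dbfs_mount = Path("/dbfs")
if dbfs_mount.is_dir():
    for num, item in list_dir(dbfs_mount):
        print(num, item)

# `dbutils` exists only inside a Databricks notebook, hence the guard.
try:
    for info in dbutils.fs.ls("/"):        # DBFS root
        print(info.path, info.size)
    for info in dbutils.fs.ls("file:/"):   # driver's local filesystem
        print(info.path, info.size)
except NameError:
    print("dbutils is only available inside a Databricks notebook")
```

So `Path('/')` and `dbutils.fs.ls('file:/')` should show the same local root, while `dbutils.fs.ls('/')` shows the DBFS root.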

I know that some dbutils functions have a performance boost if you want to do a massive move or remove operation, but you can achieve the same thing by writing your own code from scratch 🙂
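
As a rough illustration of the "write your own code" route, here is a minimal sketch using only the Python standard library. The helper names `move_tree` and `remove_tree` are made up for this example, and the `dbfs:/tmp/...` paths in the comments are hypothetical. It works on paths the driver can see locally and says nothing about relative performance.

```python
import shutil
from pathlib import Path


def move_tree(src, dst):
    """Move a directory tree to dst, creating dst's parent directories first."""
    src, dst = Path(src), Path(dst)
    dst.parent.mkdir(parents=True, exist_ok=True)
    shutil.move(str(src), str(dst))
    return dst


def remove_tree(path):
    """Recursively delete a directory; errors such as a missing path are ignored."""
    shutil.rmtree(path, ignore_errors=True)


# Rough dbutils counterparts inside a Databricks notebook (hypothetical paths):
#   dbutils.fs.mv("dbfs:/tmp/src", "dbfs:/tmp/dst", recurse=True)
#   dbutils.fs.rm("dbfs:/tmp/dst", recurse=True)
```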

If you are very curious about what you can find on the cluster, create a single-user cluster, enable the web terminal in Databricks, and launch it. You will be logged in as root on the cluster console, where you can explore the files.

Hello Wojciech,

Thanks for your time, but I'd like to correct myself a bit. DBFS (Databricks File System) is actually Azure Blob Storage that is mounted on Databricks, whereas the libraries are installed on the driver node (as you rightly pointed out!).
