Data Engineering

How to check if a file exists in Databricks

SimhadriRaju
New Contributor

I have a while loop in which I need to check whether a file exists. If it exists, I want to read it into a DataFrame; otherwise, I want to move on to another file.

7 REPLIES

User16857282152
Contributor

There might be other ways to do this, but this works.

@Simhadri Raju

Basically,

I use dbutils.fs.head here, but anything that throws an exception if it fails to find the file would work.

So I go to read the first byte of the file with

dbutils.fs.head(arg1,1)

If that throws an exception I return False

If that succeeds I return True.

Put that in a function, call the function with your filename and you are good to go.

Full code here

## Function to check to see if a file exists

def fileExists(arg1):
    try:
        # Reading one byte is enough to prove the file is there
        dbutils.fs.head(arg1, 1)
    except Exception:
        return False
    else:
        return True

Calling that function with your filename

filename = <pathtoyourfile>

if fileExists(filename):
    print("Yes it exists")

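To tie this back to the original question (try one file, fall through to the next), here is a minimal sketch of the loop. The helper name `first_existing_path` is hypothetical, and the existence check is passed in as a parameter so the loop can be tried outside a notebook; in Databricks you would pass a function such as `fileExists`.

```python
from typing import Callable, Iterable, Optional


def first_existing_path(
    candidates: Iterable[str],
    exists: Callable[[str], bool],
) -> Optional[str]:
    """Return the first candidate for which exists(path) is true, else None."""
    for path in candidates:
        if exists(path):
            return path
    return None


# Local demonstration with a stand-in existence check (hypothetical paths)
present = {"dbfs:/data/b.csv"}
chosen = first_existing_path(
    ["dbfs:/data/a.csv", "dbfs:/data/b.csv"],
    lambda p: p in present,
)
print(chosen)
```

In a notebook, the returned path could then be handed to a reader such as `spark.read.csv(path)`, with a `None` result meaning that none of the candidates were found.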

zerogjoe
New Contributor II

Here's a Python solution:

def file_exists(path):
  try:
    dbutils.fs.ls(path)
    return True
  except Exception as e:
    if 'java.io.FileNotFoundException' in str(e):
      return False
    else:
      raise
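The re-raise in the last branch is what separates this from a blanket `except`: only a missing path becomes `False`, while other failures (permissions, for instance) still surface. The sketch below shows that behaviour locally. It assumes the listing function is passed in as `ls`, and `fake_ls` is a hypothetical stand-in for `dbutils.fs.ls`, not part of any Databricks API.

```python
def path_exists(path, ls):
    """ls is any callable that lists a path and raises on failure."""
    try:
        ls(path)
        return True
    except Exception as e:
        # Only a missing path counts as "does not exist"
        if "java.io.FileNotFoundException" in str(e):
            return False
        raise


def fake_ls(path):
    # Hypothetical stand-in for dbutils.fs.ls, for local demonstration only
    if path == "dbfs:/missing":
        raise Exception("java.io.FileNotFoundException: /missing")
    if path == "dbfs:/forbidden":
        raise PermissionError("access denied")
    return []


print(path_exists("dbfs:/ok", fake_ls))
print(path_exists("dbfs:/missing", fake_ls))
```

Calling `path_exists("dbfs:/forbidden", fake_ls)` raises `PermissionError` instead of returning `False`, so a file you cannot read is not silently treated as absent.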

dughub
New Contributor II

Very many thanks to @zerogjoe for his elegant answer, which works perfectly for Databricks-formatted file paths.

To make this a little more robust and allow for filesystem API paths (which can be used with os, glob, etc. and start with "/dbfs"), I've added a few lines of code.

def exists(path):
    """Check for existence of path within Databricks file system."""
    if path[:5] == "/dbfs":
        import os
        return os.path.exists(path)
    else:
        try:
            dbutils.fs.ls(path)
            return True
        except Exception as e:
            if 'java.io.FileNotFoundException' in str(e):
                return False
            else:
                raise
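One caveat with the `path[:5] == "/dbfs"` test is that it also matches names such as "/dbfs_archive". An alternative, sketched below, is to normalise every path to the `dbfs:/` form first and then use a single check. The helper name `to_dbfs_uri` is hypothetical, and the sketch assumes the usual mapping where the local `/dbfs` mount corresponds to the DBFS root.

```python
def to_dbfs_uri(path: str) -> str:
    """Map a local-style '/dbfs/...' path to 'dbfs:/...'; leave other paths unchanged."""
    if path == "/dbfs":
        return "dbfs:/"
    if path.startswith("/dbfs/"):
        # Drop the mount prefix and keep the rest of the path
        return "dbfs:/" + path[len("/dbfs/"):]
    return path


print(to_dbfs_uri("/dbfs/tmp/a.csv"))
print(to_dbfs_uri("/dbfs_archive/a.csv"))
```

The result can then be passed to a `dbutils.fs.ls`-based check regardless of which style the caller used.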

Hari_Gopinath
New Contributor II

If you are looking for a scala solution, here it is:

def pathExists(tablePath: String): Boolean = {
  try {
    dbutils.fs.ls(tablePath)
    true
  } catch {
    case e: java.io.FileNotFoundException =>
      println("Given path cannot be located")
      false
  }
}

Mustious
New Contributor II

You can do so by running the snippet below, which uses the new [databricks python SDK](https://github.com/databricks/databricks-sdk-py/):

Install the package:
`pip install databricks-sdk`

Python snippet:
```python
from databricks.sdk import WorkspaceClient

# Remember to change the arguments below
w_client = WorkspaceClient(host="my_host", token="my_db_tokens")

# add an absolute path
dbfs_path_exist = w_client.dbfs.exists('/dbfs_my_path')
```

Hope it helps 🙂

Amit_Dass
New Contributor II

How to check if a file exists in DBFS?

Let's write a Python function to check if the file exists or not

-------------------------------------------------------------

def file_exists(path):
    try:
        dbutils.fs.ls(path)
        return True
    except Exception as e:
        if 'java.io.FileNotFoundException' in str(e):
            return False
        else:
            raise

Result = file_exists("dbfs:/Repos/")
print(Result)

A result of "True" means that the path exists.
