Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

How to check whether a file exists in Databricks

SimhadriRaju
New Contributor

I have a while loop in which I need to check whether a file exists. If it exists, I want to read the file into a DataFrame; otherwise, I want to move on to the next file.

7 REPLIES

User16857282152
Contributor

There might be other ways to do this, but this works.

@Simhadri Raju

Basically, I use dbutils.fs.head here, but anything that throws an exception when it fails to find the file would work.

So I try to read the first byte of the file with

dbutils.fs.head(arg1,1)

If that throws an exception, I return False.

If that succeeds, I return True.

Put that in a function, call the function with your filename, and you are good to go.

Full code here:

## Function to check to see if a file exists

def fileExists(arg1):
  try:
    # Reading the first byte fails if the file cannot be found
    dbutils.fs.head(arg1, 1)
  except Exception:
    return False
  else:
    return True

Calling that function with your filename:

filename = "<pathtoyourfile>"

if fileExists(filename):
  print("Yes it exists")

zerogjoe
New Contributor II

Here's a Python solution:

def file_exists(path):
  try:
    dbutils.fs.ls(path)
    return True
  except Exception as e:
    if 'java.io.FileNotFoundException' in str(e):
      return False
    else:
      raise
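
For the loop in the original question, a minimal sketch might look like the following. The name `read_existing` and the sample paths are hypothetical, and the existence check and the reader are passed in as arguments so that the example also runs outside a Databricks notebook, where `dbutils` and `spark` are not defined.

```python
from typing import Callable, Dict, Iterable, TypeVar

T = TypeVar("T")


def read_existing(
    paths: Iterable[str],
    exists: Callable[[str], bool],
    read: Callable[[str], T],
) -> Dict[str, T]:
    """Read every path that exists; silently skip the ones that do not."""
    loaded: Dict[str, T] = {}
    for path in paths:
        if not exists(path):
            continue  # missing file: move on to the next candidate
        loaded[path] = read(path)
    return loaded


# Stand-in storage so the sketch runs outside a notebook (made-up paths).
fake_storage = {"dbfs:/tmp/b.csv": "contents of b"}

loaded = read_existing(
    ["dbfs:/tmp/a.csv", "dbfs:/tmp/b.csv"],
    exists=lambda p: p in fake_storage,
    read=lambda p: fake_storage[p],
)
print(loaded)
```

In a notebook, `exists` could be the `file_exists` function above and `read` could be something like `spark.read.csv`, assuming the files are CSV.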

dughub
New Contributor II

Many thanks to @zerogjoe for his elegant answer, which works perfectly for Databricks-formatted file paths.

To make this a little more robust and allow for filesystem API paths (which start with "/dbfs" and can be used with os, glob, etc.), I've added a few lines of code.

def exists(path):
    """Check for existence of path within Databricks file system."""
    if path[:5] == "/dbfs":
        import os
        return os.path.exists(path)
    else:
        try:
            dbutils.fs.ls(path)
            return True
        except Exception as e:
            if 'java.io.FileNotFoundException' in str(e):
                return False
            else:
                raise
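
If you need to move between the two path styles, a small converter can help. This is only a sketch: the helper names `to_local_path` and `to_dbfs_uri` are made up for illustration, and it assumes the cluster exposes DBFS through the local "/dbfs" mount, so that "dbfs:/x" and "/dbfs/x" point at the same file.

```python
def to_local_path(path: str) -> str:
    """Convert a 'dbfs:/...' path to the equivalent '/dbfs/...' path.

    Paths that do not start with 'dbfs:' are returned unchanged.
    """
    prefix = "dbfs:"
    if path.startswith(prefix):
        return "/dbfs/" + path[len(prefix):].lstrip("/")
    return path


def to_dbfs_uri(path: str) -> str:
    """Convert a '/dbfs/...' path to the equivalent 'dbfs:/...' path.

    Paths outside the '/dbfs' mount are returned unchanged.
    """
    if path == "/dbfs" or path.startswith("/dbfs/"):
        return "dbfs:/" + path[len("/dbfs"):].lstrip("/")
    return path


print(to_local_path("dbfs:/tmp/data.csv"))
print(to_dbfs_uri("/dbfs/tmp/data.csv"))
```

Note that `to_dbfs_uri` only treats "/dbfs" followed by a slash (or nothing) as the mount, so a path such as "/dbfs_my_path" is left alone.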

Hari_Gopinath
Databricks Employee

If you are looking for a Scala solution, here it is:

def pathExists(tablePath: String): Boolean = {
  try {
    // Listing the path throws if it does not exist
    dbutils.fs.ls(tablePath)
    true
  } catch {
    case e: java.io.FileNotFoundException =>
      println("Given path cannot be located")
      false
  }
}

Mustious
New Contributor II

You can do so by running the snippet below, which uses the new [Databricks Python SDK](https://github.com/databricks/databricks-sdk-py/):

Install the package:
`pip install databricks-sdk`

Python snippet:
```python
from databricks.sdk import WorkspaceClient

# Remember to change the arguments below
w_client = WorkspaceClient(host="my_host", token="my_db_tokens")

# add an absolute path
dbfs_path_exist = w_client.dbfs.exists('/dbfs_my_path')
```

Hope it helps 🙂

Amit_Dass
New Contributor II

How to check if a file exists in DBFS?

Let's write a Python function to check if the file exists or not

-------------------------------------------------------------

def file_exists(path):
    try:
        dbutils.fs.ls(path)
        return True
    except Exception as e:
        if 'java.io.FileNotFoundException' in str(e):
            return False
        else:
            raise

Result = file_exists("dbfs:/Repos/")
print(Result)

A result of "True" means that the path exists.

