Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

How to read csv files stored in my Databricks workspace using a Python script in my local computer?

alexkychen
New Contributor II

I am developing a Python app on my local computer, and I would like it to read some data stored in my Databricks workspace, preferably using pandas. The data are stored in .csv files in the workspace. How can I make this happen? Is it possible via a file URL? A code snippet would be appreciated! Thanks!

2 REPLIES

eniwoke
New Contributor II

Hi alexkychen, assuming the file is saved in DBFS in your Databricks workspace, you can fetch its contents through the DBFS read endpoint of the Databricks REST API -> https://docs.databricks.com/api/workspace/dbfs/read

Here is a simple Python snippet that does this from your local machine. It authenticates with a personal access token and prints the API's JSON response, whose "data" field holds the Base64-encoded content of the file.

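import requests
import json

DATABRICKS_HOST = 'https://<FILL_IN_DATABRICKS_HOST>'
DATABRICKS_TOKEN = '<FILL_IN_TOKEN>'

reqUrl = f"{DATABRICKS_HOST}/api/2.0/dbfs/read"

# Authenticate with the personal access token
headersList = {
    "Authorization": f"Bearer {DATABRICKS_TOKEN}",
    "Content-Type": "application/json"
}

# The endpoint expects an absolute DBFS path. On clusters, /dbfs/... is the
# local mount of dbfs:/..., so a file at dbfs:/tmp/example_folder/test.csv
# would likely be "/tmp/example_folder/test.csv" here.
payload = json.dumps({
    "path": "/dbfs/tmp/example_folder/test.csv"
})

response = requests.request("GET", reqUrl, data=payload, headers=headersList)

# Fail loudly on HTTP errors such as a bad token or a missing file
response.raise_for_status()

# Print the JSON response; its "data" field holds the Base64-encoded content
print(response.text)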
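
Since the question asks for pandas, here is a rough sketch of how the response could be turned into a DataFrame. It assumes the read endpoint returns at most 1 MB per call and reports "bytes_read" and "data" in each response, so larger files are fetched in chunks by advancing the offset. The helper names (fetch_chunk, read_dbfs_file, load_csv) are made up for illustration, and the sketch uses the standard library's urllib instead of requests so that pandas is the only extra dependency. Treat it as a starting point, not a tested client.

```python
import base64
import io
import json
import urllib.parse
import urllib.request

CHUNK_SIZE = 1024 * 1024  # assumed per-call limit of the DBFS read endpoint (1 MB)


def fetch_chunk(host, token, path, offset, length=CHUNK_SIZE):
    """Call GET /api/2.0/dbfs/read once and return the parsed JSON response."""
    query = urllib.parse.urlencode({"path": path, "offset": offset, "length": length})
    request = urllib.request.Request(
        f"{host}/api/2.0/dbfs/read?{query}",
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def read_dbfs_file(fetch, path, chunk_size=CHUNK_SIZE):
    """Assemble a file from chunks returned by fetch(path, offset, length)."""
    chunks = []
    offset = 0
    while True:
        body = fetch(path, offset, chunk_size)
        # "data" is Base64 text; decode each chunk back to raw bytes
        chunks.append(base64.b64decode(body.get("data", "")))
        offset += body["bytes_read"]
        # A short (or empty) chunk means the end of the file was reached
        if body["bytes_read"] < chunk_size:
            break
    return b"".join(chunks)


def load_csv(host, token, path):
    """Download a CSV file from DBFS and parse it with pandas."""
    import pandas as pd  # imported lazily so the helpers above work without pandas

    raw = read_dbfs_file(
        lambda p, offset, length: fetch_chunk(host, token, p, offset, length), path
    )
    return pd.read_csv(io.BytesIO(raw))
```

With the placeholders from the snippet above filled in, a call would look like df = load_csv(DATABRICKS_HOST, DATABRICKS_TOKEN, "/tmp/example_folder/test.csv"). Passing the fetch function into read_dbfs_file keeps the chunking logic separate from the HTTP call, which makes it easy to try out against a fake response before pointing it at a real workspace.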

Hope this helps 🙂

Eni

alexkychen
New Contributor II

Hi Eni,

Thank you very much for your reply. I also did some research and realized that storing sensitive data (which mine is) in DBFS is no longer recommended by Databricks for security reasons, as stated here: https://docs.databricks.com/en/files/index.html#work-with-files-in-dbfs-mounts-and-dbfs-root. I will look for another way to store the data on Databricks so that it can be accessed locally and securely.

Anyway, your reply is much appreciated!    
