
Downloading and storing a PDF file to FileStore not working

msa_2j212
New Contributor II

I'm trying to download a PDF file and store it in FileStore using this code in a Notebook:

 

import requests

with open('/dbfs/FileStore/file.pdf', 'wb') as f:
    f.write(requests.get('https://url.com/file.pdf').content)

 

But I'm getting this error:

FileNotFoundError: [Errno 2] No such file or directory

What am I doing wrong?

2 REPLIES

Brian2
New Contributor III

It might be easier to use the curl command. In a notebook you can run it as a shell command, or use Python, to first download the file to the driver's local temp storage:

%sh curl https://url.com/file.pdf --output /tmp/file.pdf

or in Python:

import urllib.request
urllib.request.urlretrieve("https://url.com/file.pdf", "/tmp/file.pdf")

Then move the file to DBFS

dbutils.fs.mv("file:/tmp/file.pdf", "dbfs:/FileStore/file.pdf")
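
For reference, the two steps can be combined into one small Python sketch. The helper names download_to_driver and to_file_uri are made up for illustration, and the dbutils call is left as a comment because dbutils only exists inside a Databricks notebook:

```python
import os
import tempfile
import urllib.request


def download_to_driver(url, filename):
    """Download `url` into the driver's local temp directory and return the local path."""
    local_path = os.path.join(tempfile.gettempdir(), filename)
    urllib.request.urlretrieve(url, local_path)
    return local_path


def to_file_uri(local_path):
    """Prefix a local driver path with the file: scheme used in the dbutils.fs.mv call."""
    return "file:" + local_path


# In a Databricks notebook (dbutils is only defined there):
# local_path = download_to_driver("https://url.com/file.pdf", "file.pdf")
# dbutils.fs.mv(to_file_uri(local_path), "dbfs:/FileStore/file.pdf")
```

Downloading to local disk first and then moving means the write never goes through the /dbfs path that raised the FileNotFoundError in the question.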

 

msa_2j212
New Contributor II

This worked, thanks. 
