Data Engineering
Forum Posts

Data_Engineer3
by Contributor II
  • 3136 Views
  • 2 replies
  • 7 kudos

Move folder from DBFS location to user workspace directory in Azure Databricks

I need to move a group of files (Python or Scala files) or a folder from a DBFS location to the user workspace directory in Azure Databricks, to do testing on the files. It's very difficult to upload each file one by one into the user workspace directory, so is it...

Latest Reply
Kaniz
Community Manager

Hi @KARTHICK N, we haven't heard from you since the last response from @Werner Stinckens, and I was checking back to see if his suggestions helped you. If you have found a solution, please share it with the community, as it can be helpful to...

1 More Replies
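
A minimal sketch (not from the thread) of one way to bulk-import files from DBFS into a workspace folder, using the Workspace API import endpoint. The host, token, and both directory paths are hypothetical placeholders, not values from the thread.

    import base64
    import os
    import requests

    HOST = "https://<your-workspace>.azuredatabricks.net"  # assumption: your workspace URL
    TOKEN = os.environ["DATABRICKS_TOKEN"]                  # assumption: a personal access token
    DBFS_DIR = "/dbfs/tmp/scripts"                          # hypothetical source folder (FUSE view of DBFS)
    WORKSPACE_DIR = "/Users/me@example.com/scripts"         # hypothetical target folder

    for name in os.listdir(DBFS_DIR):
        if not name.endswith(".py"):
            continue
        with open(os.path.join(DBFS_DIR, name), "rb") as f:
            content = base64.b64encode(f.read()).decode()
        resp = requests.post(
            f"{HOST}/api/2.0/workspace/import",
            headers={"Authorization": f"Bearer {TOKEN}"},
            json={
                "path": f"{WORKSPACE_DIR}/{name[:-3]}",  # workspace path, .py extension dropped
                "format": "SOURCE",
                "language": "PYTHON",
                "content": content,
                "overwrite": True,
            },
        )
        resp.raise_for_status()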
Surendra
by New Contributor III
  • 5030 Views
  • 5 replies
  • 8 kudos

Resolved! Databricks notebook is taking 2 hours to write to /dbfs/mnt (blob storage). Same job is taking 8 minutes to write to /dbfs/FileStore. I would like to understand why write performance is different in both cases.

Problem statement: source file format .tar.gz, average size 10 MB, number of tar.gz files 1,000, each tar.gz file containing around 20,000 CSV files. Requirement: untar the tar.gz files and write the CSV files to blob storage / an intermediate storage layer for further...

Latest Reply
Kaniz
Community Manager

Hi @Hubert Dudek, I just wanted to thank you. We're so lucky to have customers like you! The way you are helping our community is incredible.

4 More Replies
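
A minimal sketch (not from the thread) of one common workaround for the slowdown described above: writing ~20,000 small files per archive straight to a blob-storage mount pays a per-file round-trip cost, so untar to the driver's local disk first and copy to the mount in one recursive operation. All paths are hypothetical.

    import tarfile

    archive = "/dbfs/mnt/raw/batch1.tar.gz"  # hypothetical input archive on the mount
    local_dir = "/tmp/extracted/batch1"      # driver-local disk: fast, no blob round-trips

    with tarfile.open(archive, "r:gz") as tar:
        tar.extractall(local_dir)

    # one recursive copy to the mount instead of 20,000 individual writes
    dbutils.fs.cp(f"file:{local_dir}", "dbfs:/mnt/extracted/batch1", recurse=True)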
rajib76
by New Contributor II
  • 1430 Views
  • 1 reply
  • 2 kudos

Resolved! DBFS with Google Cloud Storage(GCS)

Does DBFS support GCS?

Latest Reply
Hubert-Dudek
Esteemed Contributor III

Yes, you just need to create a service account for Databricks and then assign the Storage Admin role on the bucket. After that you can mount GCS the standard way: bucket_name = "<bucket-name>" mount_name = "<mount-name>" dbutils.fs.mount("gs://%s" % bucket_name, "/m...

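
A minimal sketch completing the idea in the (truncated) reply above, following the standard dbutils mount pattern; the bucket and mount names are placeholders, and it assumes the cluster's service account already has Storage Admin on the bucket.

    bucket_name = "<bucket-name>"
    mount_name = "<mount-name>"
    dbutils.fs.mount("gs://%s" % bucket_name, "/mnt/%s" % mount_name)
    display(dbutils.fs.ls("/mnt/%s" % mount_name))  # verify the mount is readable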
study_community
by New Contributor III
  • 6469 Views
  • 13 replies
  • 4 kudos

Resolved! Not able to move files from local to dbfs through dbfs CLI

Hi folks, I have installed and configured the Databricks CLI on my local machine. I tried to copy a local file from my personal computer to a dbfs:/ path using dbfs cp. I can see the file is copied from local, but it is only visible locally. I am not able to ...

Latest Reply
Anonymous
Not applicable

Hi, could you try to save the file from your local machine to the dbfs:/FileStore location?

    # Put local file test.py to dbfs:/FileStore/test.py
    dbfs cp test.py dbfs:/FileStore/test.py

12 More Replies
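
A minimal sketch (not from the thread): after the CLI copy, verify from a notebook that the file actually landed in DBFS rather than only on your local machine. The filename matches the hypothetical test.py from the reply above.

    files = dbutils.fs.ls("dbfs:/FileStore/")
    print([f.name for f in files])  # should include test.py

    # the same file through the /dbfs FUSE mount, readable with plain Python
    with open("/dbfs/FileStore/test.py") as f:
        print(f.read())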
snoeprol
by New Contributor II
  • 3252 Views
  • 4 replies
  • 2 kudos

Resolved! Unable to open files with python, but filesystem shows files exist

Dear community, I have the following problem: I have uploaded an ML-model file and transferred it to the directory with %fs mv '/FileStore/Tree_point_classification-1.dlpk' '/dbfs/mnt/group22/Tree_point_classification-1.dlpk'. When I now check ...

Latest Reply
Hubert-Dudek
Esteemed Contributor III

There is dbfs:/dbfs/ displayed, so maybe the file is in the /dbfs/dbfs directory? Please check it and try to open it with open('/dbfs/dbfs. You can also use "Data" from the left menu to check what is in the DBFS file system more easily.

3 More Replies
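
A minimal sketch (not from the thread) of the path-scheme mismatch that usually explains this: %fs and dbutils.fs resolve paths against the dbfs:/ scheme, so a destination starting with /dbfs/... is treated as a DBFS folder literally named dbfs, while plain Python open() needs the /dbfs FUSE prefix. Paths below reuse the post's filename but are otherwise hypothetical.

    src = "dbfs:/FileStore/Tree_point_classification-1.dlpk"
    dst = "dbfs:/mnt/group22/Tree_point_classification-1.dlpk"  # note: no /dbfs prefix here
    dbutils.fs.mv(src, dst)

    # FUSE path for plain-Python access to the same file
    with open("/dbfs/mnt/group22/Tree_point_classification-1.dlpk", "rb") as f:
        print(f.read(16))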
hoopla
by New Contributor II
  • 4250 Views
  • 3 replies
  • 1 kudos

Unable to copy multiple files from file:/tmp to dbfs:/tmp

I am downloading multiple files by web scraping, and by default they are stored in /tmp. I can copy a single file by providing the filename and path: %fs cp file:/tmp/2020-12-14_listings.csv.gz dbfs:/tmp but when I try to copy multiple files I get an ...

Latest Reply
hoopla
New Contributor II

Thanks Deepak, this is what I had suspected. Hopefully the wildcard feature will be available in the future. Thanks!

2 More Replies
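
A minimal sketch (not from the thread) of the usual workaround: %fs cp does not expand wildcards, but you can list the local directory yourself and copy the matches in a loop. The filename pattern is hypothetical, modeled on the post.

    import fnmatch

    for f in dbutils.fs.ls("file:/tmp/"):
        if fnmatch.fnmatch(f.name, "*_listings.csv.gz"):
            dbutils.fs.cp(f.path, "dbfs:/tmp/" + f.name)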
rami1
by New Contributor II
  • 1114 Views
  • 1 reply
  • 0 kudos

Missing Databricks Datasets

Hi, I am looking at my Databricks workspace and it looks like I am missing the DBFS databricks-datasets root folder. The DBFS root folders I can view are FileStore, local_disk0, mnt, pipelines and user. Can I mount databricks-datasets, or am I missing some...

Latest Reply
Ryan_Chynoweth
Honored Contributor III

If you run the following command, do you receive an error, or do you just get an empty list?

    dbutils.fs.ls("/databricks-datasets")

hravilla
by New Contributor
  • 2128 Views
  • 1 reply
  • 0 kudos

Upload file to DBFS fails with error code 0

When trying to upload to DBFS from a local machine, I get the error "Error occurred when processing file ... : Server responded with 0 code". DBR 7.3 LTS, Spark 3.0.1, Scala 2.12. Uploading the file using "upload" in the Databricks cloud console, the c...

Latest Reply
PramodNaik
New Contributor II

I am facing the same issue with GCP Databricks. I am able to upload smaller files, but when I tried a 3 MB file, Databricks choked and I got the above error. I tried AWS Databricks, and it works fine even for larger files.

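
A minimal sketch (not from the thread) of a workaround when the console upload fails: stream the file up through the DBFS REST API, which accepts base64-encoded blocks of at most 1 MB each. The host, token, and file paths are hypothetical placeholders.

    import base64
    import os
    import requests

    HOST = "https://<your-workspace>.gcp.databricks.com"
    HEADERS = {"Authorization": "Bearer " + os.environ["DATABRICKS_TOKEN"]}

    def dbfs_upload(local_path, dbfs_path, chunk=1024 * 1024):
        # open a streaming handle on the target DBFS path
        r = requests.post(f"{HOST}/api/2.0/dbfs/create", headers=HEADERS,
                          json={"path": dbfs_path, "overwrite": True})
        r.raise_for_status()
        handle = r.json()["handle"]
        with open(local_path, "rb") as f:
            while block := f.read(chunk):  # add-block accepts <= 1 MB per call
                requests.post(f"{HOST}/api/2.0/dbfs/add-block", headers=HEADERS,
                              json={"handle": handle,
                                    "data": base64.b64encode(block).decode()}
                              ).raise_for_status()
        requests.post(f"{HOST}/api/2.0/dbfs/close", headers=HEADERS,
                      json={"handle": handle}).raise_for_status()

    dbfs_upload("report.csv", "/FileStore/report.csv")  # hypothetical 3 MB file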
smanickam
by New Contributor II
  • 12480 Views
  • 5 replies
  • 3 kudos

com.databricks.sql.io.FileReadException: Error while reading file dbfs:

I ran the below statement and got the error:

    %python
    data = sqlContext.read.parquet("/FileStore/tables/ganesh.parquet")
    display(data)

Error: SparkException: Job aborted due to stage failure: Task 0 in stage 27.0 failed 1 times, most recent failure:...

Latest Reply
MatthewSzafir
New Contributor III

I'm having a similar issue reading a JSON file. It is ~550 MB compressed and is on a single line:

    val cfilename = "c_datafeed_20200128.json.gz"
    val events = spark.read.json(s"/mnt/c/input1/$cfilename")
    display(events)

The filename is correct and t...

4 More Replies
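
A minimal sketch (not from the thread) relevant to the gzipped-JSON reply above: gzip is not a splittable codec, so a single 550 MB .gz file is decompressed by one task and can exhaust executor memory. Decompressing it through the /dbfs FUSE mount first lets Spark split the plain-text file. Paths reuse the reply's names but are otherwise hypothetical.

    import gzip
    import shutil

    src = "/dbfs/mnt/c/input1/c_datafeed_20200128.json.gz"
    dst = "/dbfs/mnt/c/input1/c_datafeed_20200128.json"

    with gzip.open(src, "rb") as f_in, open(dst, "wb") as f_out:
        shutil.copyfileobj(f_in, f_out)

    # if the whole file is one JSON document on a single line (not JSON Lines),
    # Spark additionally needs the multiLine option to parse it as one record
    events = spark.read.option("multiLine", True).json(
        "dbfs:/mnt/c/input1/c_datafeed_20200128.json")
    display(events)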
AnaDel_Campo_Me
by New Contributor
  • 9899 Views
  • 2 replies
  • 1 kudos

FileNotFoundError: [Errno 2] No such file or directory or IsADirectoryError: [Errno 21] Is a directory

I have been trying to open a file on DBFS using all different combinations. If I use the following code: with open("/dbfs/FileStore/df/Downloadedfile.csv", 'r', newline='') as f I get IsADirectoryError: [Errno 21] Is a directory. With open("dbfs:...

Latest Reply
paulmark
New Contributor II

To get rid of this error, you can try using Python's file-existence checks to confirm that Python sees the file at all. In other words, you can make sure that the user has indeed typed a correct path to a real, existing file. If the user do...

1 More Replies
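
A minimal sketch (not from the thread) of the checks the reply suggests, using the path from the post. An IsADirectoryError on a /dbfs path usually means the name points at a directory (for example a Spark output folder) rather than a plain file.

    import os

    path = "/dbfs/FileStore/df/Downloadedfile.csv"
    print(os.path.exists(path))  # False -> the path is wrong
    print(os.path.isdir(path))   # True  -> it's a folder, not a file
    if os.path.isdir(path):
        # Spark writes its output as part-* files inside the named folder
        print(os.listdir(path))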
NandhaKumar
by New Contributor II
  • 3458 Views
  • 3 replies
  • 0 kudos

How to specify multiple files in --py-files in the spark-submit command for a Databricks job? All the files to be specified in --py-files are present in dbfs:.

I have created a Databricks workspace in Azure and a cluster for Python 3. I am creating a job using spark-submit parameters. How do I specify multiple files in --py-files in the spark-submit command for a Databricks job? All the files to be specified in ...

Latest Reply
shyam_9
Valued Contributor

Hi @Nandha Kumar, please go through the docs below on passing Python files to a job: https://docs.databricks.com/dev-tools/api/latest/jobs.html#sparkpythontask

2 More Replies
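
A minimal sketch (not from the thread): in spark-submit syntax, --py-files takes a single comma-separated list, so multiple DBFS files go into one argument. Below is a hypothetical parameters list of the kind you would place in the job definition; all paths are made up.

    spark_submit_params = [
        "--py-files",
        "dbfs:/libs/helpers.py,dbfs:/libs/utils.py,dbfs:/libs/models.py",  # dependencies, comma-separated
        "dbfs:/jobs/main.py",  # the entry-point script itself
    ]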