Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

by Tonny_Stark (New Contributor III)
  • 11122 Views
  • 7 replies
  • 1 kudos

FileNotFoundError: [Errno 2] No such file or directory when I try to unzip .tar or .zip files

Hello, how are you? I have a small problem. I need to unzip some .zip, .tar, and .gz files, each of which may contain multiple files. When trying to unzip the .zip files I got this error: FileNotFoundError: [Errno 2] No such file or directory: but the files are in ...

Labels: error
Latest Reply
Anonymous (Not applicable)
  • 1 kudos

Hi @Alfredo Vallejos, thank you for your question! To assist you better, please take a moment to review the answer and let me know if it best fits your needs. Please help us select the best solution by clicking "Select As Best" if it does. Your feed...

6 More Replies
by RantoB (Valued Contributor)
  • 22323 Views
  • 8 replies
  • 7 kudos

Resolved! Read CSV directly from a URL with PySpark

I would like to load a CSV file directly into a Spark dataframe in Databricks. I tried the following code: url = "https://opendata.reseaux-energies.fr/explore/dataset/eco2mix-national-tr/download/?format=csv&timezone=Europe/Berlin&lang=fr&use_labels_fo...

Latest Reply
anwangari (New Contributor)
  • 7 kudos

Hello, it's the end of 2024 and I still have this issue with Python. As mentioned, the sc method no longer works. Also, working with volumes within "/databricks/driver/" is not supported in Apache Spark. ALTERNATIVE SOLUTION: Use requests to download the file fr...

7 More Replies
by Jana (New Contributor III)
  • 7829 Views
  • 9 replies
  • 4 kudos

Resolved! Parsing a 5 GB JSON file runs long on the cluster

I was creating a Delta table from an ADLS JSON input file, but the job ran long while creating the Delta table from the JSON. Below is my cluster configuration. Is the issue related to the cluster config? Do I need to upgrade the cluster config? The cluster ...

Latest Reply
-werners- (Esteemed Contributor III)
  • 4 kudos

With multiline = true, the JSON is read as a whole and processed as such. I'd try a beefier cluster. (A sketch contrasting the two read modes follows below.)

8 More Replies
by AmineHY (Contributor)
  • 9869 Views
  • 5 replies
  • 6 kudos

Resolved! How to read JSON files embedded in a list of lists?

Hello, I am trying to read this JSON file but didn't succeed. You can see the head of the file: JSON inside a list of lists. Any idea how to read this file?

Latest Reply
adriennn (Contributor II)
  • 6 kudos

The correct way to do this without using open, which works only with local/mounted files, is to read the files as binaryFile; then you get the entire JSON string on each row, and from there you can use from_json() and explode() to extract the ...

4 More Replies
by Venky (New Contributor III)
  • 78363 Views
  • 18 replies
  • 19 kudos

Resolved! FileNotFoundError: [Errno 2] No such file or directory: '/dbfs/FileStore/tables/world_bank.csv' when trying to read a CSV file in Databricks

I am trying to read a CSV file using Databricks and I am getting this error: FileNotFoundError: [Errno 2] No such file or directory: '/dbfs/FileStore/tables/world_bank.csv'

Latest Reply
Alexis (New Contributor III)
  • 19 kudos

Hi, you can try:

    my_df = (spark.read.format("csv")
             .option("inferSchema", "true")   # to get the types from your data
             .option("sep", ",")              # if your file is using "," as separator
             .option("header", "true"))       # if you...

17 More Replies
by Jiri_Koutny (New Contributor III)
  • 5939 Views
  • 11 replies
  • 3 kudos

Delay in file updates on the filesystem

Hi, I noticed that there is quite a significant delay (2-10 s) between making a change to a file in Repos via the Databricks file edit window and the propagation of that change to the filesystem. Our engineers and scientists use YAML config files. If the...

Latest Reply
Irka (New Contributor II)
  • 3 kudos

Is there a solution to this? BTW, the "ls" command trick didn't work for me.

10 More Replies
by nyehia (Contributor)
  • 5427 Views
  • 9 replies
  • 0 kudos

Cannot access a SQL file from a notebook

Hey, I have a repo of notebooks and SQL files. The typical way is to update or create notebooks in the repo, push it, and the CICD pipeline deploys the notebooks to the Shared workspace. The issue is that I can access the SQL files in the Repo but cannot ...

Latest Reply
ok_1 (New Contributor II)
  • 0 kudos

ok

8 More Replies
by SimhadriRaju (New Contributor)
  • 51479 Views
  • 7 replies
  • 0 kudos

How to check whether a file exists in Databricks

I have a while loop where I have to check whether a file exists; if it exists, read the file into a dataframe, else go to another file.

Latest Reply
Amit_Dass (New Contributor II)
  • 0 kudos

How to check if a file exists in DBFS? Let's write a Python function to check whether the file exists or not:

    def file_exists(path):
        try:
            dbutils.fs.ls(path)
            return True
        except ...

6 More Replies
by learnerbricks (New Contributor II)
  • 5883 Views
  • 4 replies
  • 0 kudos

Unable to save file in DBFS

I took the Azure datasets that are available for practice, got the 10 days of data from that dataset, and now I want to save this data into DBFS in CSV format. I am facing an error: "No such file or directory: 'No such file or directory: '/dbfs...

Latest Reply
pardosa (New Contributor II)
  • 0 kudos

Hi, after some experimenting you need to be aware that a folder created with dbutils.fs.mkdirs("/dbfs/tmp/myfolder") actually ends up in /dbfs/dbfs/tmp/myfolder. If you want to access the path to_csv("/dbfs/tmp/myfolder/mytest.csv"), you should create it with this script: dbutils.fs...

3 More Replies
by kinsun (New Contributor II)
  • 19846 Views
  • 5 replies
  • 1 kudos

Resolved! DBFS and Local File System Doubts

Dear Databricks expert, I have some doubts when dealing with DBFS and the local file system. Case 01: copy a file from ADLS to DBFS. I am able to do so through the Python code below: #spark.conf.set("fs.azure.account.auth.type", "OAuth") spark.conf.set("fs.a...

Latest Reply
Anonymous (Not applicable)
  • 1 kudos

Hi @KS LAU, thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers your q...

4 More Replies
by CrisCampos (New Contributor II)
  • 3349 Views
  • 1 reply
  • 1 kudos

How to load a "pickle/joblib" file on Databricks

Hi Community, I am trying to load a joblib file on Databricks, but it doesn't seem to be working. I'm getting an error message: "Incompatible format detected". Any idea how to load this type of file on db? Thanks!

Latest Reply
tapash-db (Databricks Employee)
  • 1 kudos

You can import the joblib/joblibspark package to load joblib files.

by Data_Engineer_3 (New Contributor III)
  • 17794 Views
  • 12 replies
  • 4 kudos

FileNotFoundError: [Errno 2] No such file or directory: '/FileStore/tables/flight_data.zip', although the file exists at the location mentioned above

I am new to learning Spark and working on some practice; I have uploaded a zip file to the DBFS /FileStore/tables directory and am trying to run Python code to unzip the file. The Python code is: from zipfile import * with ZipFile("/FileStore/tables/fli...

Latest Reply
883022 (New Contributor II)
  • 4 kudos

What if changing the runtime is not an option? I'm experiencing a similar issue using the following: %pip install -r /dbfs/path/to/file.txt. This worked for a while, but now I'm getting the Errno 2 mentioned above. I am still able to print the same file...

11 More Replies
by GC-James (Contributor II)
  • 14656 Views
  • 15 replies
  • 5 kudos

Resolved! Lost memory when using dbutils

Why does copying a 9 GB file from a container to /dbfs lose me 50 GB of memory? (Which doesn't come back until I restart the cluster.)

Latest Reply
AdrianP (New Contributor II)
  • 5 kudos

Hi James, did you get to the bottom of this? We are experiencing the same issue, and all the suggested solutions don't seem to work. Thanks, Adrian

14 More Replies
by jch (New Contributor III)
  • 7519 Views
  • 4 replies
  • 5 kudos

Resolved! Why does spark.read.csv come back with com.databricks.sql.io.FileReadException: Error while reading file dbfs:/mnt/cntnr/demo/circuits.csv?

I need help understanding why I can't open a file. In a Databricks notebook, I use this code: %fs ls /mnt/cntnr/demo. I get back dbfs:/mnt/cntnr/demo/circuits.csv as one of the path values. When I use this code, I get an error: circuits_df = spark.read....

Latest Reply
jch (New Contributor III)
  • 5 kudos

It turns out my Spark config was wrong:

    # Set Spark configuration
    configs = {"fs.azure.account.auth.type": "OAuth",
               "fs.azure.account.oauth.provider.type": "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
               "fs.azu...

3 More Replies
by teng_shin_lim (New Contributor)
  • 1919 Views
  • 1 reply
  • 1 kudos

Having an issue trying to download a CSV file from a website using Firefox Selenium

Hi, when I click the download button on a website through Firefox Selenium using element.click(), with the download destination set to Azure Data Lake Storage, then after the download starts, those .csv and .csv.part files never get m...

Latest Reply
Anonymous (Not applicable)
  • 1 kudos

Hi @Brandon Lim, great to meet you, and thanks for your question! Let's see if your peers in the community have an answer to your question. Thanks.
