Topics with Label: File

Forum Posts

Sorted by:

Start a conversation

by Bie1234 • New Contributor III

02-15-2023 12:51:51 AM

1305 Views
2 replies
3 kudos

Resolved! accidently delete paquet file in dbfs

I accidently delete manual paquet file in dbfs how can I recovery this recovery this file

Data Engineering

1305 Views
2 replies
3 kudos

02-15-2023 12:51:51 AM

View Replies

Latest Reply

Ajay-Pandey
Esteemed Contributor III

02-15-2023 1:24:20 AM

3 kudos

Hi @pansiri panaudom ,There is no option restore deleted files in databricks .

3 kudos

02-15-2023 1:24:20 AM

1 More Replies

by User16783853906 • Contributor III

06-07-2021 2:14:43 PM

879 Views
1 replies
1 kudos

Understanding file retention with Vacuum

I have seen few instances where users reported that they run OPTIMIZE for the past week worth of data and they follow by VACUUM with RETAIN of 168 HOURS (for example), the old files aren't being deleted, "VACUUM is not removing old files from the tab...

Data Engineering

879 Views
1 replies
1 kudos

06-07-2021 2:14:43 PM

View Replies

Latest Reply

Priyanka_Biswas
Valued Contributor

01-31-2023 7:46:06 PM

1 kudos

Hello @Venkatesh Kottapalli VACUUM removes all files from the table directory that are not managed by Delta, as well as data files that are no longer in the latest state of the transaction log for the table and are older than a retention threshold. ...

1 kudos

01-31-2023 7:46:06 PM

by chanansh • Contributor

01-11-2023 9:46:12 AM

2817 Views
9 replies
9 kudos

copy files from azure to s3

I am trying to copy files from azure to s3. I've created a solution by comparing file lists and copy manually to a temp file and upload. However, I just found AutoLoader and I would like to use that https://docs.databricks.com/ingestion/auto-loader/i...

Data Engineering

2817 Views
9 replies
9 kudos

01-11-2023 9:46:12 AM

View Replies

Latest Reply

Falokun
New Contributor II

01-20-2023 5:06:27 AM

9 kudos

Just use tools like Goodsync and Gs Richcopy 360 to copy directly from blob to S3, I think you will never face problems like that

9 kudos

01-20-2023 5:06:27 AM

8 More Replies

by andrew0117 • Contributor

01-11-2023 7:09:29 AM

5176 Views
1 replies
0 kudos

Resolved! How to read a local file using Databricks( file stored in your own computer)

without uploading the file into dbfs? Thanks!

Data Engineering

5176 Views
1 replies
0 kudos

01-11-2023 7:09:29 AM

View Replies

Latest Reply

Hubert-Dudek
Esteemed Contributor III

01-11-2023 7:18:33 AM

0 kudos

In my opinion, it doesn't make sense, but...you can Mount SMB Azure file share on a Windows Machine https://learn.microsoft.com/en-us/azure/storage/files/storage-how-to-use-files-windows and then mount the same folder on databricks using pip install ...

0 kudos

01-11-2023 7:18:33 AM

by su • New Contributor

10-14-2022 6:29:43 AM

2174 Views
3 replies
0 kudos

Reading from /tmp no longer working

Since yesterday, reading a file copied into the cluster is no longer working.What used to work:blob = gcs_bucket.get_blob("dev/data.ndjson") -> worksblob.download_to_filename("/tmp/data-copy.ndjson") -> worksdf = spark.read.json("/tmp/data-copy.ndjso...

Data Engineering

2174 Views
3 replies
0 kudos

10-14-2022 6:29:43 AM

View Replies

Latest Reply

Evan_From_Bosto
New Contributor II

01-05-2023 6:55:43 AM

0 kudos

I encountered this same issue, and figured out a fix!For some reason, it seems like only %sh cells can access the /tmp directory. So I just did...%sh ch /tmp/<file> /dbfs/<desired-location> and then accessed it form there using Spark.

0 kudos

01-05-2023 6:55:43 AM

2 More Replies

by semi • New Contributor II

12-21-2022 11:06:52 AM

1131 Views
3 replies
3 kudos

Access file location problem

import pandas as pd from apiclient.discovery import build from oauth2client.service_account import ServiceAccountCredentials df = spark.read.json("/FileStore/tables/cert.json") SCOPES = ['https://www.googleapis.com/auth/analytics.readonly'] KEY_FIL...

Data Engineering

1131 Views
3 replies
3 kudos

12-21-2022 11:06:52 AM

View Replies

Latest Reply

-werners-
Esteemed Contributor III

12-22-2022 5:36:07 AM

3 kudos

Looks like it is because the oauth2client.service_account does not know about DBFS (whereas spark does).Is it an option to manage your secrets in databricks? https://docs.databricks.com/security/secrets/secrets.html

3 kudos

12-22-2022 5:36:07 AM

2 More Replies

by Sharmila04 • New Contributor

12-20-2022 7:15:50 PM

1603 Views
3 replies
0 kudos

DBFS File Browser Error RESOURCE_DOES_NOT_EXIST:

Hi,I am new to databricks, and was trying to follow some tutorial to upload a file and move it under some different folder. I used DBFS option.While trying to move/rename the file I am getting below error, can you please help to understand why I am g...

Data Engineering

1603 Views
3 replies
0 kudos

12-20-2022 7:15:50 PM

View Replies

Latest Reply

Aviral-Bhardwaj
Esteemed Contributor III

12-20-2022 8:55:45 PM

0 kudos

use these three commands and it will workdbutils.fs.ls('dbfs:/FileStore/vehicle_data.csv')dbutils.fs.ls('/dbfs/FileStore/vehicle_data.csv')dbutils.fs.ls('/dbfs/dbfs/FileStore/vehicle_data.csv')ThanksAviral

0 kudos

12-20-2022 8:55:45 PM

2 More Replies

by KrishZ • Contributor

12-15-2022 2:56:57 PM

2550 Views
4 replies
1 kudos

How to print the path of a .py file or a notebook?

I have stored a test.py in the dbfs at the below location "/dbfs/FileStore/shared_uploads/krishna@company.com/Project_Folder/test.py"I have a print statement in test.py which says the belowprint( os.getcwd() )and it prints the below'/databricks/drive...

Data Engineering

2550 Views
4 replies
1 kudos

12-15-2022 2:56:57 PM

View Replies

Latest Reply

Aviral-Bhardwaj
Esteemed Contributor III

12-16-2022 7:44:50 AM

1 kudos

Hey @Krishna Zanwar Please use the below code this will work and as you want the specific location you can create a custom code and format the path using a python formatter , it will give you desired result .

1 kudos

12-16-2022 7:44:50 AM

3 More Replies

by KVNARK • Honored Contributor II

12-15-2022 6:30:43 PM

679 Views
1 replies
5 kudos

Resolved! Trigger another .py file by uisng 2 .py files.

Hi,I have 3 .py files - a.py, b.py & c.py files. By joining a.py & b.py, based on the output that I get need to trigger the c.py file.

Data Engineering

679 Views
1 replies
5 kudos

12-15-2022 6:30:43 PM

View Replies

Latest Reply

Ajay-Pandey
Esteemed Contributor III

12-15-2022 10:52:04 PM

5 kudos

Hi @KVNARK . refer below link this will help in thisLink

5 kudos

12-15-2022 10:52:04 PM

by RajibRajib_Mand • New Contributor III

01-03-2022 3:36:03 AM

2032 Views
2 replies
0 kudos

Reading Password protected excel(.xlsx) file in databricks

I want to read password protected excel file and load the data delta table.Can you pleas let me know how this can be achieved in databricks?

Data Engineering

2032 Views
2 replies
0 kudos

01-03-2022 3:36:03 AM

View Replies

Latest Reply

igorsalo22
New Contributor II

11-30-2022 10:11:12 AM

0 kudos

df = spark.read.format("com.crealytics.spark.excel")\ .option("dataAddress", "'Base'!A1")\ .option("header", "true")\ .option("workbookPassword", "test")\ .load("test.xlsx")display(df)

0 kudos

11-30-2022 10:11:12 AM

1 More Replies

by db-avengers2rul • Contributor II

11-29-2022 5:18:27 AM

1124 Views
1 replies
0 kudos

Resolved! zip file not able to import in workspace

Dear Team,Using the community edition when i tried to import a zip file it is always throwing some error

Data Engineering

1124 Views
1 replies
0 kudos

11-29-2022 5:18:27 AM

View Replies

Latest Reply

db-avengers2rul
Contributor II

11-29-2022 5:19:52 AM

0 kudos

Please refer to the error in the attachment my question is this restriction is only for community edition ? or also for premium account ?

0 kudos

11-29-2022 5:19:52 AM

by hare • New Contributor III

05-15-2022 11:13:34 PM

7623 Views
1 replies
1 kudos

Failed to merge incompatible data types

We are processing the josn file from the storage location on every day and it will get archived once the records are appended into the respective tables.source_location_path: "..../mon=05/day=01/fld1" , "..../mon=05/day=01/fld2" ..... "..../mon=05/d...

Data Engineering

7623 Views
1 replies
1 kudos

05-15-2022 11:13:34 PM

View Replies

Latest Reply

Shalabh007
Honored Contributor

11-29-2022 2:48:55 AM

1 kudos

@Hare Krishnan the issues highlighted can easily be handled using the .option("mergeSchema", "true") at the time of reading all the files.Sample code:spark.read.option("mergeSchema", "true").json(<file paths>, multiLine=True)The only scenario this w...

1 kudos

11-29-2022 2:48:55 AM

by AmineHY • Contributor

11-16-2022 5:24:01 AM

5031 Views
7 replies
9 kudos

Resolved! How to read JSON files embedded in a list of lists?

HelloI am trying to read this JSON file but didn't succeed You can see the head of the file, JSON inside a list of lists. Any idea how to read this file?

Data Engineering

5031 Views
7 replies
9 kudos

11-16-2022 5:24:01 AM

View Replies

Latest Reply

AmineHY
Contributor

11-21-2022 5:51:12 AM

9 kudos

Here is my solution, I am sure it can be optimizedimport json data=[] with open(path_to_json_file, 'r') as f: data.extend(json.load(f)) df = spark.createDataFrame(data[0], schema=schema)

9 kudos

11-21-2022 5:51:12 AM

6 More Replies

by rajat1 • New Contributor

09-27-2022 9:46:37 PM

9783 Views
3 replies
2 kudos

How to convert dataframe (df), to a excel file that I can share with my colleagues ?

I am working on microsoft azure databrick, I have a final dataframe of shape (3276*23) , I want to share it in form of excel file? How can I do it ( I am using ->df.to_excel('fileOutput.xlsx', sheet_name = 'Sheet1', index = False) , command is runn...

Data Engineering

9783 Views
3 replies
2 kudos

09-27-2022 9:46:37 PM

View Replies

Latest Reply

Anonymous
Not applicable

11-19-2022 2:15:43 AM

2 kudos

You could try this way, convert Pyspark Dataframe to Pandas Dataframe then export to excel file.

2 kudos

11-19-2022 2:15:43 AM

2 More Replies

by sreedata • New Contributor III

03-31-2022 6:47:42 AM

2646 Views
5 replies
12 kudos

Resolved! Date field getting changed when reading from excel file to dataframe

The date field is getting changed while reading data from source .xls file to the dataframe. In the source xl file all columns are strings but i am not sure why date column alone behaves differentlyIn Source file date is 1/24/2022.In dataframe it is ...

Data Engineering

2646 Views
5 replies
12 kudos

03-31-2022 6:47:42 AM

View Replies

Latest Reply

Pradeep_Namani
New Contributor III

11-17-2022 6:56:19 AM

12 kudos

Hi Team, @Merca Ovnerud I am also facing same issue , below is the code snippet which I am using df=spark.read.format("com.crealytics.spark.excel").option("header","true").load("/mnt/dataplatform/Tenant_PK/Results.xlsx")I have a couple of date colum...

12 kudos

11-17-2022 6:56:19 AM

4 More Replies