Data Engineering

Forum Posts

Bie1234
by New Contributor III
  • 1305 Views
  • 2 replies
  • 3 kudos

Resolved! Accidentally deleted Parquet file in DBFS

I accidentally deleted a Parquet file manually in DBFS. How can I recover this file?

Latest Reply
Ajay-Pandey
Esteemed Contributor III
  • 3 kudos

Hi @pansiri panaudom, there is no option to restore deleted files in Databricks.

1 More Replies
User16783853906
by Contributor III
  • 879 Views
  • 1 replies
  • 1 kudos

Understanding file retention with Vacuum

I have seen a few instances where users reported that they ran OPTIMIZE on the past week's worth of data and followed it with VACUUM with RETAIN 168 HOURS (for example), but the old files weren't being deleted: "VACUUM is not removing old files from the tab...

Latest Reply
Priyanka_Biswas
Valued Contributor
  • 1 kudos

Hello @Venkatesh Kottapalli, VACUUM removes all files from the table directory that are not managed by Delta, as well as data files that are no longer in the latest state of the transaction log for the table and are older than a retention threshold. ...
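
The rule in this reply can be sketched in plain Python. This is only an illustration of the two conditions described (no longer in the latest table state, and older than the retention threshold), not Delta's actual implementation; `DataFile`, its `last_modified` field, and `vacuum_candidates` are hypothetical names made up for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class DataFile:
    path: str
    last_modified: datetime  # hypothetical: the timestamp the file's age is measured from


def vacuum_candidates(files, live_paths, now, retain_hours=168):
    """Return paths that meet both conditions from the reply:
    not in the latest table state AND older than the retention threshold."""
    cutoff = now - timedelta(hours=retain_hours)
    return [
        f.path
        for f in files
        if f.path not in live_paths and f.last_modified < cutoff
    ]


now = datetime(2023, 1, 15)
files = [
    DataFile("part-old.parquet", datetime(2023, 1, 1)),      # unreferenced, 14 days old
    DataFile("part-recent.parquet", datetime(2023, 1, 12)),  # unreferenced, 3 days old
    DataFile("part-live.parquet", datetime(2023, 1, 1)),     # still referenced by the table
]
print(vacuum_candidates(files, {"part-live.parquet"}, now))
```

Under this reading, a file that stopped being referenced only a few days ago is not yet past a 168-hour threshold, so it would stay on disk until a later VACUUM run.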

chanansh
by Contributor
  • 2817 Views
  • 9 replies
  • 9 kudos

Copy files from Azure to S3

I am trying to copy files from Azure to S3. I've created a solution that compares file lists, copies each file manually to a temp file, and uploads it. However, I just found Auto Loader and I would like to use that: https://docs.databricks.com/ingestion/auto-loader/i...

Latest Reply
Falokun
New Contributor II
  • 9 kudos

Just use tools like Goodsync and Gs Richcopy 360 to copy directly from Blob to S3; I think you will never face problems like that.

8 More Replies
andrew0117
by Contributor
  • 5176 Views
  • 1 replies
  • 0 kudos

Resolved! How to read a local file using Databricks (file stored on your own computer)

without uploading the file into DBFS? Thanks!

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 0 kudos

In my opinion, it doesn't make sense, but... you can mount an SMB Azure file share on a Windows machine (https://learn.microsoft.com/en-us/azure/storage/files/storage-how-to-use-files-windows) and then mount the same folder on Databricks using pip install ...

su
by New Contributor
  • 2174 Views
  • 3 replies
  • 0 kudos

Reading from /tmp no longer working

Since yesterday, reading a file copied into the cluster is no longer working. What used to work:
blob = gcs_bucket.get_blob("dev/data.ndjson") -> works
blob.download_to_filename("/tmp/data-copy.ndjson") -> works
df = spark.read.json("/tmp/data-copy.ndjso...

Latest Reply
Evan_From_Bosto
New Contributor II
  • 0 kudos

I encountered this same issue and figured out a fix! For some reason, it seems like only %sh cells can access the /tmp directory. So I just ran %sh cp /tmp/<file> /dbfs/<desired-location> and then accessed it from there using Spark.
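
The workaround in this reply moves the file from the driver's local /tmp into DBFS through the /dbfs mount so that Spark can read it. A rough Python equivalent is sketched below, assuming the cluster exposes DBFS at /dbfs; `stage_for_spark` and the paths in the usage comment are hypothetical. Spark on Databricks typically resolves a bare path such as /tmp/... against DBFS rather than the driver's local disk, which may be why the original read stopped finding the file.

```python
import shutil
from pathlib import Path


def stage_for_spark(local_path, dest_dir):
    """Copy a driver-local file into dest_dir and return the new path as a string."""
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    # shutil.copy returns the path of the newly created file inside dest_dir
    return str(shutil.copy(local_path, dest_dir))


# Hypothetical usage on a cluster (placeholder paths, not taken from the post):
# staged = stage_for_spark("/tmp/data-copy.ndjson", "/dbfs/tmp/staging")
# df = spark.read.json("dbfs:/tmp/staging/data-copy.ndjson")
```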

2 More Replies
semi
by New Contributor II
  • 1131 Views
  • 3 replies
  • 3 kudos

Access file location problem

import pandas as pd
from apiclient.discovery import build
from oauth2client.service_account import ServiceAccountCredentials
df = spark.read.json("/FileStore/tables/cert.json")
SCOPES = ['https://www.googleapis.com/auth/analytics.readonly']
KEY_FIL...

Latest Reply
-werners-
Esteemed Contributor III
  • 3 kudos

It looks like this is because oauth2client.service_account does not know about DBFS (whereas Spark does). Is it an option to manage your secrets in Databricks? https://docs.databricks.com/security/secrets/secrets.html
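
To illustrate the point about DBFS awareness: a library that opens files with ordinary Python file APIs needs a local filesystem path, not a path as Spark sees it. A minimal sketch of the mapping follows, assuming the cluster exposes DBFS through the /dbfs local mount (which may not hold on every cluster type); `to_local_path` is a hypothetical helper, not a Databricks API.

```python
def to_local_path(dbfs_path):
    """Map a DBFS path as Spark sees it ('dbfs:/x' or '/x') to the local mount path '/dbfs/x'.

    Assumes the input is not already a '/dbfs/...' local path.
    """
    if dbfs_path.startswith("dbfs:"):
        dbfs_path = dbfs_path[len("dbfs:"):]
    return "/dbfs/" + dbfs_path.lstrip("/")


print(to_local_path("/FileStore/tables/cert.json"))
print(to_local_path("dbfs:/FileStore/tables/cert.json"))
```

Both calls print /dbfs/FileStore/tables/cert.json, which is the form a plain `open()` call could use on such a cluster.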

2 More Replies
Sharmila04
by New Contributor
  • 1603 Views
  • 3 replies
  • 0 kudos

DBFS File Browser Error RESOURCE_DOES_NOT_EXIST:

Hi, I am new to Databricks and was trying to follow a tutorial to upload a file and move it to a different folder. I used the DBFS option. While trying to move/rename the file I am getting the error below. Can you please help me understand why I am g...

Latest Reply
Aviral-Bhardwaj
Esteemed Contributor III
  • 0 kudos

Use these three commands and it will work:
dbutils.fs.ls('dbfs:/FileStore/vehicle_data.csv')
dbutils.fs.ls('/dbfs/FileStore/vehicle_data.csv')
dbutils.fs.ls('/dbfs/dbfs/FileStore/vehicle_data.csv')
Thanks, Aviral

2 More Replies
KrishZ
by Contributor
  • 2550 Views
  • 4 replies
  • 1 kudos

How to print the path of a .py file or a notebook?

I have stored a test.py in DBFS at the location below:
"/dbfs/FileStore/shared_uploads/krishna@company.com/Project_Folder/test.py"
I have a print statement in test.py which says:
print( os.getcwd() )
and it prints the following:
'/databricks/drive...

Latest Reply
Aviral-Bhardwaj
Esteemed Contributor III
  • 1 kudos

Hey @Krishna Zanwar, please use the code below; this will work. As you want the specific location, you can write custom code and format the path using a Python formatter; it will give you the desired result.
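
The code this reply refers to is not reproduced in this listing. As a separate, minimal sketch (not the reply's method): a .py file run as a script can usually derive its own location from `__file__`, whereas `os.getcwd()` reports the working directory of the process. `script_location` is a hypothetical helper, and `__file__` is typically not defined inside notebook cells.

```python
import os
from pathlib import Path


def script_location(file_attr):
    """Return (absolute script path, containing folder) for a module's __file__ value."""
    path = Path(file_attr).resolve()
    return str(path), str(path.parent)


# Inside test.py one would call: script_location(__file__)
# os.getcwd() reports the process's working directory instead,
# which need not be the folder that contains the script.
print(os.getcwd())
```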

3 More Replies
KVNARK
by Honored Contributor II
  • 679 Views
  • 1 replies
  • 5 kudos

Resolved! Trigger another .py file by using 2 .py files.

Hi, I have 3 .py files: a.py, b.py & c.py. Based on the output I get from joining a.py & b.py, I need to trigger c.py.

Latest Reply
Ajay-Pandey
Esteemed Contributor III
  • 5 kudos

Hi @KVNARK, please refer to the link below; it will help with this: Link
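
The linked resource is not available in this listing. One generic way to express "run c.py depending on what a.py and b.py produce" in plain Python is sketched below; it is not necessarily what the link describes. It reads "joining" as combining the two scripts' printed output, which is an assumption, and `run_script`, `run_chain`, and `should_trigger` are hypothetical names.

```python
import subprocess
import sys


def run_script(path):
    """Run a Python script in a child process and return its stripped stdout."""
    result = subprocess.run(
        [sys.executable, path], capture_output=True, text=True, check=True
    )
    return result.stdout.strip()


def run_chain(a_path, b_path, c_path, should_trigger):
    """Run a.py and b.py, then run c.py only if should_trigger(out_a, out_b) is true."""
    out_a = run_script(a_path)
    out_b = run_script(b_path)
    if should_trigger(out_a, out_b):
        return run_script(c_path)
    return None


# Hypothetical usage: trigger c.py only when both scripts print the same value.
# run_chain("a.py", "b.py", "c.py", lambda a, b: a == b)
```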

RajibRajib_Mand
by New Contributor III
  • 2032 Views
  • 2 replies
  • 0 kudos

Reading password-protected Excel (.xlsx) file in Databricks

I want to read a password-protected Excel file and load the data into a Delta table. Can you please let me know how this can be achieved in Databricks?

Latest Reply
igorsalo22
New Contributor II
  • 0 kudos

df = spark.read.format("com.crealytics.spark.excel")\
  .option("dataAddress", "'Base'!A1")\
  .option("header", "true")\
  .option("workbookPassword", "test")\
  .load("test.xlsx")
display(df)

1 More Replies
db-avengers2rul
by Contributor II
  • 1124 Views
  • 1 replies
  • 0 kudos

Resolved! Unable to import zip file in workspace

Dear Team, when I try to import a zip file using the Community Edition, it always throws an error.

Latest Reply
db-avengers2rul
Contributor II
  • 0 kudos

Please refer to the error in the attachment. My question: is this restriction only for the Community Edition, or does it also apply to premium accounts?

hare
by New Contributor III
  • 7623 Views
  • 1 replies
  • 1 kudos

Failed to merge incompatible data types

We process the JSON files from the storage location every day, and they get archived once the records are appended to the respective tables. source_location_path: "..../mon=05/day=01/fld1" , "..../mon=05/day=01/fld2" ..... "..../mon=05/d...

Latest Reply
Shalabh007
Honored Contributor
  • 1 kudos

@Hare Krishnan the issues highlighted can easily be handled using .option("mergeSchema", "true") at the time of reading all the files. Sample code: spark.read.option("mergeSchema", "true").json(<file paths>, multiLine=True) The only scenario this w...
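
As a conceptual illustration of what merging schemas across files means, and of where an "incompatible data types" error like the one in the title can arise, here is a minimal plain-Python sketch. It is not Spark's implementation; `merge_schemas` and the sample field names and types are hypothetical.

```python
def merge_schemas(left, right):
    """Union two {field: type_name} schemas; raise if a shared field has different types."""
    merged = dict(left)
    for field, type_name in right.items():
        if field in merged and merged[field] != type_name:
            raise TypeError(
                f"Failed to merge field '{field}': {merged[field]} vs {type_name}"
            )
        merged[field] = type_name
    return merged


day1 = {"id": "long", "amount": "double"}
day2 = {"id": "long", "note": "string"}
print(merge_schemas(day1, day2))
```

Fields that appear in only one file are simply added to the merged schema; a field that appears in both with different types is the case that cannot be reconciled automatically in this sketch.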

AmineHY
by Contributor
  • 5031 Views
  • 7 replies
  • 9 kudos

Resolved! How to read JSON files embedded in a list of lists?

Hello, I am trying to read this JSON file but didn't succeed. You can see the head of the file: JSON inside a list of lists. Any idea how to read this file?

Latest Reply
AmineHY
Contributor
  • 9 kudos

Here is my solution; I am sure it can be optimized:
import json
data = []
with open(path_to_json_file, 'r') as f:
  data.extend(json.load(f))
df = spark.createDataFrame(data[0], schema=schema)
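
A possible variation on this solution, sketched under the assumption that the file is a JSON array of arrays of objects (the screenshots are not available here): the version above passes only the first inner list (`data[0]`) to Spark, while the hypothetical helper below flattens every inner list into one list of records.

```python
import json
from itertools import chain


def load_nested_records(path):
    """Load a JSON file shaped like [[{...}, {...}], [{...}]] into one flat list of dicts."""
    with open(path, "r") as f:
        outer = json.load(f)
    # Concatenate all inner lists, preserving their order
    return list(chain.from_iterable(outer))


# Hypothetical usage, reusing the names from the post:
# df = spark.createDataFrame(load_nested_records(path_to_json_file), schema=schema)
```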

6 More Replies
rajat1
by New Contributor
  • 9783 Views
  • 3 replies
  • 2 kudos

How to convert a dataframe (df) to an Excel file that I can share with my colleagues?

I am working on Microsoft Azure Databricks. I have a final dataframe of shape (3276*23) that I want to share as an Excel file. How can I do it? (I am using -> df.to_excel('fileOutput.xlsx', sheet_name = 'Sheet1', index = False); the command is runn...

Latest Reply
Anonymous
Not applicable
  • 2 kudos

You could try it this way: convert the PySpark DataFrame to a pandas DataFrame, then export it to an Excel file.

2 More Replies
sreedata
by New Contributor III
  • 2646 Views
  • 5 replies
  • 12 kudos

Resolved! Date field getting changed when reading from excel file to dataframe

The date field is getting changed while reading data from the source .xls file into the dataframe. In the source Excel file all columns are strings, but I am not sure why the date column alone behaves differently. In the source file the date is 1/24/2022. In the dataframe it is ...

Latest Reply
Pradeep_Namani
New Contributor III
  • 12 kudos

Hi Team, @Merca Ovnerud, I am also facing the same issue. Below is the code snippet I am using: df=spark.read.format("com.crealytics.spark.excel").option("header","true").load("/mnt/dataplatform/Tenant_PK/Results.xlsx") I have a couple of date colum...

4 More Replies