Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

apiury
by New Contributor III
  • 3944 Views
  • 4 replies
  • 2 kudos

Delta file question

Hi! I'm using Auto Loader to ingest binary files into Delta format. I have 7 binary files, but Delta generates 3 files named part-0000, part-0001... Why does it generate these files with the format part-000...

Latest Reply
Anonymous
Not applicable
  • 2 kudos

Hi @Alejandro Piury Pinzón, we haven't heard from you since the last response from @Lakshay Goel, and I was checking back to see if the suggestions helped you. Otherwise, if you have found a solution, please share it with the community, as it can be hel...

3 More Replies
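A minimal sketch of the ingestion pattern in the question, assuming a Databricks notebook where spark is predefined; the paths are hypothetical. Spark writes one part-NNNN file per write task, so seven input files can land in fewer Delta data files; the part-* naming is normal Spark/Delta output, not an error:

```python
# Auto Loader reading binary files into a Delta table (hypothetical paths).
stream = (
    spark.readStream.format("cloudFiles")
    .option("cloudFiles.format", "binaryFile")
    .load("/mnt/raw/binary-input")
)

(
    stream.writeStream.format("delta")
    .option("checkpointLocation", "/mnt/checkpoints/binary-ingest")
    .trigger(availableNow=True)  # recent DBR: process all pending files, then stop
    .start("/mnt/delta/binary-table")
)
```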
Ovi
by New Contributor III
  • 2024 Views
  • 1 reply
  • 0 kudos

Spark Dataframe write to Delta format doesn't create a _delta_log

Hello everyone, I have an intermittent issue when trying to create a Delta table for the first time in Databricks: all the data gets converted into parquet at the specified location but the _delta_log is not created or, if created, it's left empty, t...

Latest Reply
jose_gonzalez
Databricks Employee
  • 0 kudos

Can you list (display) the folder location "deltaLocation"? What files do you see there? Have you tried using a new location for testing? Do you get the same behavior?

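A quick way to act on this reply and inspect what actually landed at the table path; a sketch with a hypothetical path, assuming a Databricks notebook where dbutils is predefined:

```python
delta_location = "/mnt/delta/my_table"  # stand-in for "deltaLocation" from the question

# A healthy Delta table shows part-*.parquet data files plus a _delta_log/
# directory containing at least one JSON commit file.
for f in dbutils.fs.ls(delta_location):
    print(f.name, f.size)

for f in dbutils.fs.ls(delta_location + "/_delta_log"):
    print(f.name, f.size)
```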
pvignesh92
by Honored Contributor
  • 1005 Views
  • 0 replies
  • 0 kudos

Very often, we need to know how many files a table path contains and the overall size of the path for various optimizations. In the past, I had to wr...

Very often, we need to know how many files a table path contains and the overall size of the path for various optimizations. In the past, I had to write my own logic to accomplish this. Delta Lake is making life easier. See how simple it is to obtain...

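The post is truncated, but the built-in it most likely alludes to is DESCRIBE DETAIL, which returns the file count and total size in a single row; a sketch with a hypothetical path:

```python
# DESCRIBE DETAIL returns one row per table, including numFiles and
# sizeInBytes -- no hand-rolled file-listing logic required.
detail = spark.sql("DESCRIBE DETAIL delta.`/mnt/delta/my_table`")
detail.select("numFiles", "sizeInBytes").show()
```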
wim_schmitz_per
by New Contributor II
  • 3635 Views
  • 2 replies
  • 2 kudos

Transforming/Saving Python Class Instances to Delta Rows

I'm trying to reuse a Python package to do a very complex series of parsing binary files into workable data in Delta format. I have made the first part (binary file parsing) work with a UDF: asffileparser = F.udf(File()._parseBytes, AsfFileDelta.getSch...

Latest Reply
Debayan
Databricks Employee
  • 2 kudos

Hi, did you try to follow "Fix it by registering a custom IObjectConstructor for this class."? Also, could you please provide us with the full error?

1 More Replies
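A sketch of the wiring the question describes, assuming a Databricks notebook; File, _parseBytes, and the struct schema from the asker's package are replaced with hypothetical stand-ins. Note the UDF here wraps a plain top-level function, which sidesteps the pickling problem the error message points at:

```python
from pyspark.sql import functions as F
from pyspark.sql.types import StructType, StructField, StringType, LongType

# Hypothetical stand-in for AsfFileDelta.getSchema().
parsed_schema = StructType([
    StructField("title", StringType()),
    StructField("duration_ms", LongType()),
])

# Hypothetical stand-in for File()._parseBytes; a top-level function
# pickles cleanly, unlike a bound method on a class instance.
def parse_bytes(content):
    return ("example", 0)

parse_udf = F.udf(parse_bytes, parsed_schema)

df = spark.read.format("binaryFile").load("/mnt/raw/asf")  # hypothetical path
(
    df.withColumn("parsed", parse_udf("content"))
      .select("path", "parsed.*")
      .write.format("delta").mode("append")
      .save("/mnt/delta/asf_parsed")  # hypothetical path
)
```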
joakon
by New Contributor III
  • 10620 Views
  • 6 replies
  • 6 kudos
Latest Reply
huyd
New Contributor III
  • 6 kudos

Check the "delimiter" option in your read cell.

5 More Replies
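The original question is truncated above, but going by the reply, the usual fix is to set the delimiter explicitly on the reader; a sketch with a hypothetical path and separator:

```python
# Spark's CSV reader defaults to a comma; a mismatched delimiter
# typically collapses every column into one.
df = (
    spark.read.format("csv")
    .option("header", "true")
    .option("delimiter", "|")    # match the file's actual separator
    .load("/mnt/raw/input.csv")  # hypothetical path
)
```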
Biber
by New Contributor III
  • 3294 Views
  • 5 replies
  • 8 kudos

Resolved! Change schema when writing to the Delta format

Is it possible to reapply a schema in Delta files? For example, we have a history with a string field, but from some point on we need to replace the string with a struct. In my case, neither the merge option nor overwriteSchema works.

Latest Reply
Biber
New Contributor III
  • 8 kudos

Hi guys! Definitely, thank you for your support.

4 More Replies
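When a column's type changes from string to struct, schema merging cannot express it; the pattern that usually resolves threads like this is a full rewrite with overwriteSchema. A sketch with a hypothetical path and struct definition; this replaces the table's schema, so test on a copy first:

```python
from pyspark.sql import functions as F

df = spark.read.format("delta").load("/mnt/delta/events")  # hypothetical path

# Parse the old string column into the new struct shape (hypothetical fields).
migrated = df.withColumn(
    "payload", F.from_json("payload", "struct<id:string,value:double>")
)

(
    migrated.write.format("delta")
    .mode("overwrite")
    .option("overwriteSchema", "true")  # allow the incompatible schema change
    .save("/mnt/delta/events")
)
```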
Anonymous
by Not applicable
  • 8662 Views
  • 8 replies
  • 7 kudos

Resolved! data frame takes unusually long time to write for small data sets

We have configured the workspace with our own VPC. We need to extract data from DB2 and write it in Delta format. We tried 550k records with 230 columns, and it took 50 minutes to complete the task; 15M records take more than 18 hours. Not sure why this takes suc...

Latest Reply
elgeo
Valued Contributor II
  • 7 kudos

Hello. We face exactly the same issue. Reading is quick but writing takes a long time. Just to clarify, it is about a table with only 700k rows. Any suggestions please? Thank you. remote_table = spark.read.format("jdbc") \ .option("driver", "com...

7 More Replies
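A common cause of this symptom is that a JDBC read without partitioning options runs as a single task, so the "slow write" is really a single-threaded fetch upstream. A sketch of a partitioned DB2 read; the URL, table, partition column, and bounds are hypothetical:

```python
df = (
    spark.read.format("jdbc")
    .option("url", "jdbc:db2://host:50000/MYDB")
    .option("driver", "com.ibm.db2.jcc.DB2Driver")
    .option("dbtable", "SCHEMA.MY_TABLE")
    .option("user", "<user>")
    .option("password", "<password>")
    .option("partitionColumn", "ID")   # numeric column to split on
    .option("lowerBound", "1")
    .option("upperBound", "15000000")
    .option("numPartitions", "16")     # parallel fetch tasks
    .option("fetchsize", "10000")      # rows per network round trip
    .load()
)

df.write.format("delta").mode("overwrite").save("/mnt/delta/db2_table")
```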
StephanieAlba
by Databricks Employee
  • 2193 Views
  • 3 replies
  • 6 kudos
Latest Reply
jose_gonzalez
Databricks Employee
  • 6 kudos

Hi @Stephanie Rivera, just a friendly follow-up. Did any of the responses help you resolve your question? If so, please mark it as best. Otherwise, please let us know if you still need help.

2 More Replies
KKo
by Contributor III
  • 5102 Views
  • 3 replies
  • 4 kudos

Resolved! Reading multiple parquet files from same _delta_log under a path

I have a path that contains _delta_log and 3 snappy.parquet files. I am trying to read all those .parquet files using spark.read.format('delta').load(path), but I always get data from only one of the files. Can't I read from all these files? If s...

Latest Reply
KKo
Contributor III
  • 4 kudos

@Werner Stinckens Thanks for the reply and explanation; that helped me understand the Delta feature.

2 More Replies
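For context on the accepted explanation: a Delta read goes through _delta_log, which lists only the files belonging to the current table version; extra snappy.parquet files in the folder are typically older versions retained for time travel. A sketch with a hypothetical path:

```python
# Reads the current version as defined by _delta_log.
df = spark.read.format("delta").load("/mnt/delta/my_table")

# Inspect the commit history, then read an older version if needed.
spark.sql("DESCRIBE HISTORY delta.`/mnt/delta/my_table`").show()
old = (
    spark.read.format("delta")
    .option("versionAsOf", 0)
    .load("/mnt/delta/my_table")
)
```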
Frankooo
by New Contributor III
  • 6947 Views
  • 8 replies
  • 7 kudos

How to optimize exporting dataframe to delta file?

Scenario: I have a dataframe that has 5 billion records/rows and 100+ columns. Is there a way to write this in Delta format efficiently? I tried to export it but cancelled after 2 hours (the write didn't finish), as this processing time is not ...

Latest Reply
jose_gonzalez
Databricks Employee
  • 7 kudos

Hi @Franco Sia, I would recommend avoiding repartition(50); instead, enable optimized writes on your Delta table. You can find more details here. Enable optimized writes and auto compaction on your Delta table. Use AQE (docs here) to have eno...

7 More Replies
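The reply's suggestions expressed as settings; a sketch with a hypothetical path, using the standard Databricks Delta property names (worth verifying against your runtime's docs):

```python
# Enable Adaptive Query Execution for better partition sizing.
spark.conf.set("spark.sql.adaptive.enabled", "true")

# Enable optimized writes and auto compaction on the target table.
spark.sql("""
    ALTER TABLE delta.`/mnt/delta/big_table`
    SET TBLPROPERTIES (
        'delta.autoOptimize.optimizeWrite' = 'true',
        'delta.autoOptimize.autoCompact' = 'true'
    )
""")

# Then write without a manual repartition(); optimized writes size the files.
df = spark.read.format("delta").load("/mnt/delta/source")  # stand-in for the 5B-row dataframe
df.write.format("delta").mode("append").save("/mnt/delta/big_table")
```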
StephanieAlba
by Databricks Employee
  • 1826 Views
  • 1 reply
  • 6 kudos
Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 6 kudos

Hi, as these are transactional tables (there are history commits and snapshots), I would not store images or videos there: the same data can be saved several times, you will have high storage costs, and it can also be slow when the data is big. I would definitely store images,...

User16783853501
by Databricks Employee
  • 2050 Views
  • 1 reply
  • 1 kudos

Converting data that is in Delta format to plain parquet format

There is often a need to convert tables from Delta format to plain Parquet format, for a number of reasons. What is the best way to do that?

Latest Reply
User16826994223
Honored Contributor III
  • 1 kudos

You can easily convert a Delta table back to a Parquet table using the following steps: if you have performed Delta Lake operations that can change the data files (for example, delete or merge), run vacuum with a retention of 0 hours to delete all data f...

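The truncated steps above match the documented procedure: vacuum away non-current files, then drop the transaction log. A sketch with a hypothetical path; retention 0 requires disabling a safety check, so only do this when nothing else is writing to the table:

```python
# 1) Remove data files that are not part of the current version.
spark.conf.set("spark.databricks.delta.retentionDurationCheck.enabled", "false")
spark.sql("VACUUM delta.`/mnt/delta/my_table` RETAIN 0 HOURS")

# 2) Delete the transaction log; the directory is now plain Parquet.
dbutils.fs.rm("/mnt/delta/my_table/_delta_log", recurse=True)

df = spark.read.format("parquet").load("/mnt/delta/my_table")
```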