- 1509 Views
- 2 replies
- 0 kudos
Working with Delta files in Spark Structured Streaming, what is the default maximum chunk size in each batch? How do I identify this type of Spark configuration in Databricks? #[Databricks SQL] #[Spark streaming] #[Spark structured streaming] #Spark
Latest Reply
Hello @KARTHICK N, the default value for spark.sql.files.maxPartitionBytes is 128 MB. These defaults are listed in the Apache Spark documentation at https://spark.apache.org/docs/latest/sql-performance-tuning.html (unless there are overrides). To che...
1 More Replies
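The reply above cites spark.sql.files.maxPartitionBytes (128 MB by default). As a rough illustration of how that value caps the size of each read split, here is a simplified plain-Python model of Spark's sizing logic; the constants are the documented defaults, while the function name and the simplified formula are mine, patterned on Spark's `FilePartition.maxSplitBytes`:

```python
# Defaults from the Spark SQL performance-tuning docs.
MAX_PARTITION_BYTES = 128 * 1024 * 1024  # spark.sql.files.maxPartitionBytes
OPEN_COST_IN_BYTES = 4 * 1024 * 1024     # spark.sql.files.openCostInBytes

def max_split_bytes(file_sizes, default_parallelism):
    """Estimate the split (chunk) size Spark uses when scanning files.

    Simplified sketch: each file is padded by the open cost, the total is
    spread over the available cores, and the result is clamped between the
    open cost and maxPartitionBytes.
    """
    total_bytes = sum(size + OPEN_COST_IN_BYTES for size in file_sizes)
    bytes_per_core = total_bytes // default_parallelism
    return min(MAX_PARTITION_BYTES, max(OPEN_COST_IN_BYTES, bytes_per_core))

# One 1 GiB file on an 8-core cluster is still capped at 128 MiB per split:
print(max_split_bytes([1024 * 1024 * 1024], 8))  # 134217728
```

On a live cluster, the effective value can be read with `spark.conf.get("spark.sql.files.maxPartitionBytes")`.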
- 1578 Views
- 4 replies
- 0 kudos
We have one function that creates files with partitions; the partitions are created based on metadata (getPartitionColumns) that we keep. In one table, two columns are specified as partition columns, say 'Team' and 'Speciality'. Wh...
Latest Reply
Hi @Thushar R, hope everything is going great. Just wanted to check in if you were able to resolve your issue. If yes, would you be happy to mark an answer as best so that other members can find the solution more quickly? If not, please tell us so we ...
3 More Replies
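With metadata-driven partitioning like the above, the partition columns become Hive-style directories in the output path. A minimal plain-Python sketch of that layout, assuming getPartitionColumns returns the configured list (e.g. `['Team', 'Speciality']`; the row and helper names are illustrative):

```python
# Illustrative helper: derive the Hive-style partition path for one row,
# given the partition columns read from metadata (hypothetical names).
def partition_path(row, partition_columns):
    return "/".join(f"{col}={row[col]}" for col in partition_columns)

row = {"Team": "Cardio", "Speciality": "Surgery", "name": "A"}
print(partition_path(row, ["Team", "Speciality"]))  # Team=Cardio/Speciality=Surgery
```

In PySpark the same layout comes from `df.write.partitionBy(*partition_columns)`, which writes files under `.../Team=.../Speciality=.../`.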
- 1814 Views
- 6 replies
- 10 kudos
I would need some suggestions from the Databricks folks. Per the documentation on Schema Evolution, for Drop and Rename the data is overwritten. Does that mean we lose data (because I read that data is not deleted but kind of staged)? Is it possible to query old da...
Latest Reply
The overwrite option will overwrite your data. If you want to change a column name, you can first alter the Delta table as needed and then append the new data. That way you can resolve both problems.
5 More Replies
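A hedged sketch of the alter-then-append flow suggested above, run against a live cluster; `my_table`, `old_name`, `new_name`, and `new_df` are placeholders, and the table properties follow the Delta Lake column-mapping documentation (column mapping must be enabled once before RENAME COLUMN is allowed):

```python
# Placeholders throughout; requires a Delta Lake version with column mapping.
spark.sql("""
    ALTER TABLE my_table SET TBLPROPERTIES (
        'delta.columnMapping.mode' = 'name',
        'delta.minReaderVersion'   = '2',
        'delta.minWriterVersion'   = '5')
""")
spark.sql("ALTER TABLE my_table RENAME COLUMN old_name TO new_name")

# Append new data instead of overwriting, so existing rows are kept.
new_df.write.format("delta").mode("append").saveAsTable("my_table")
```

As for querying old data: earlier versions of an overwritten Delta table remain reachable through time travel, e.g. `spark.read.format("delta").option("versionAsOf", 0).load(path)`, until VACUUM removes the underlying files.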
by Biber • New Contributor III
- 1581 Views
- 5 replies
- 8 kudos
Is it possible to reapply a schema in Delta files? For example, we have a history with a string field, but from some point on we need to replace the string with a struct. In my case, the merge option and overwriteSchema don't work.
Latest Reply
Biber • New Contributor III
Hi guys! Definitely, thank you for your support.
4 More Replies
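Changing a field from string to struct usually means parsing the old string values rather than only swapping the declared schema. A tiny plain-Python illustration of the idea (the column name `payload` is hypothetical):

```python
import json

# Hypothetical mini-history: a column stored as JSON strings.
rows = [{"payload": '{"a": 1}'}, {"payload": '{"a": 2}'}]

# "Reapplying" the schema = parsing each string into a structured value.
converted = [{"payload": json.loads(r["payload"])} for r in rows]
print(converted[0]["payload"]["a"])  # 1
```

In PySpark the analogous step is `from_json(col("payload"), new_schema)` followed by a write with `.mode("overwrite").option("overwriteSchema", "true")`; as far as I understand, overwriteSchema alone cannot reinterpret existing string data as structs.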
by a2_ish • New Contributor II
- 729 Views
- 2 replies
- 2 kudos
I have tried the code below to write data into a Delta table and save the Delta files to a sink. I tried using Azure Storage as the sink, but I get a "not enough access" error. I can confirm that I have enough access to Azure Storage; however, I can run the below...
Latest Reply
Hi @Ankit Kumar, does @Hubert Dudek's response answer your question? If yes, would you be happy to mark it as best so that other members can find the solution more quickly? We'd love to hear from you. Thanks!
1 More Replies
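A "not enough access" error from a cluster is often about the credentials Spark itself uses rather than the user's own permissions on the storage account. A hedged configuration sketch for an ABFS sink; the account, container, secret-scope, and key names are all placeholders, with the key kept in a Databricks secret scope rather than in code:

```python
# Placeholder values; the account key is fetched from a secret scope.
storage_account = "mystorageaccount"
spark.conf.set(
    f"fs.azure.account.key.{storage_account}.dfs.core.windows.net",
    dbutils.secrets.get(scope="my-scope", key="storage-account-key"),
)

df.write.format("delta").mode("overwrite").save(
    f"abfss://mycontainer@{storage_account}.dfs.core.windows.net/delta/events"
)
```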
- 7170 Views
- 4 replies
- 6 kudos
Hi All, I am trying to partitionBy() on a Delta file in PySpark, using the command: df.write.format("delta").mode("overwrite").option("overwriteSchema", "true").partitionBy("Partition Column").save("Partition file path") -- It doesn't seem to w...
Latest Reply
Hey @Harsha kriplani, hope you are well. Thank you for posting in here. It is awesome that you found a solution. Would you like to mark Hubert's answer as best? It would be really helpful for the other members too. Cheers!
3 More Replies
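One common reason a partitionBy() call like the one above appears not to work is that the argument must be an actual column of the DataFrame, not a free-form label such as "Partition Column". An illustrative plain-Python check (the helper name is mine, not a Spark API):

```python
# Hypothetical helper: fail fast if partitionBy() would be handed a column
# that does not exist in the DataFrame.
def check_partition_columns(df_columns, partition_columns):
    missing = [c for c in partition_columns if c not in df_columns]
    if missing:
        raise ValueError(f"partitionBy columns not in DataFrame: {missing}")
    return True

print(check_partition_columns(["Team", "Speciality", "name"], ["Team"]))  # True
```

With an existing column, `df.write.format("delta").mode("overwrite").option("overwriteSchema", "true").partitionBy("Team").save(path)` writes one subdirectory per distinct Team value.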