Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

ashu_aith1991
by New Contributor II
  • 921 Views
  • 1 reply
  • 3 kudos

delta table

Can we connect to a Databricks Delta table from one workspace to another workspace in a different subscription and run the VACUUM command on it?

Latest Reply
Anonymous
Not applicable
  • 3 kudos

Hi @ASHUTOSH YADAV, great to meet you, and thanks for your question! Let's see if your peers in the community have an answer for you. Thanks.

  • 3 kudos
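
A minimal sketch of one way the question above is sometimes approached, assuming both workspaces can authenticate to the same underlying cloud storage path and that this runs in a Databricks notebook where spark is the active SparkSession; the path and retention below are placeholders, not the poster's setup:

# From the second workspace, reference the Delta table by its storage path
# (assumes this workspace already has credentials or a mount for that path).
table_path = "abfss://container@account.dfs.core.windows.net/delta/my_table"  # placeholder

# Sanity check that the table is reachable from this workspace.
df = spark.read.format("delta").load(table_path)

# VACUUM can then be run against the path directly; 168 hours is the default retention.
spark.sql(f"VACUUM delta.`{table_path}` RETAIN 168 HOURS")
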
JordanYaker
by Contributor
  • 9681 Views
  • 3 replies
  • 6 kudos

Has anyone else seen state files disappear in low-volume delta tables?

I have some Delta tables in our dev environment that started popping up with the following error today:
py4j.protocol.Py4JJavaError: An error occurred while calling o670.execute. : org.apache.spark.SparkException: Job aborted due to stage failure: Tas...

Latest Reply
Anonymous
Not applicable
  • 6 kudos

Hi @Jordan Yaker, we haven't heard from you since the last response from @Kaniz Fatma, and I was checking back to see if her suggestions helped you. Otherwise, if you have any solution, please share it with the community, as it can be helpful to other...

  • 6 kudos
2 More Replies
Nis
by New Contributor II
  • 1513 Views
  • 1 reply
  • 2 kudos

Best sequence for using the VACUUM, OPTIMIZE, FSCK REPAIR, and REFRESH commands

I have a Delta table whose size increases gradually; we now have around 1.5 crore (15 million) rows. While running the VACUUM command on that table I am getting the error below:
ERROR: Job aborted due to stage failure: Task 7 in stage 491.0 failed 4 times, most...

Latest Reply
jose_gonzalez
Databricks Employee
  • 2 kudos

Do you have access to the executor 7 logs? Is there high GC or some other event that is causing the heartbeat timeout? Would you be able to check the failed stages?

  • 2 kudos
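
A rough sketch of one common ordering of the maintenance commands named in the question above, using a placeholder table name and the default 7-day retention; the right values depend on your workload:

# 1. Compact small files first so VACUUM has a clean set of obsolete files to remove.
spark.sql("OPTIMIZE my_db.my_table")

# 2. Remove files no longer referenced by the transaction log and older than the retention window.
spark.sql("VACUUM my_db.my_table RETAIN 168 HOURS")

# 3. Only if files were deleted outside of Delta: drop their stale entries from the log.
spark.sql("FSCK REPAIR TABLE my_db.my_table")

# 4. Refresh cached metadata so running clusters pick up the new file listing.
spark.sql("REFRESH TABLE my_db.my_table")
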
AP
by New Contributor III
  • 4028 Views
  • 5 replies
  • 3 kudos

Resolved! AutoOptimize, the OPTIMIZE command and the VACUUM command: order and production implementation best practices

So Databricks gives us a great toolkit in the form of OPTIMIZE and VACUUM. But in terms of operationalizing them, I am really confused about the best practice. Should we enable "optimized writes" by setting the following at a workspace level?
spark.conf.set...

Latest Reply
Anonymous
Not applicable
  • 3 kudos

@AKSHAY PALLERLA Just checking in to see if you got a solution to the issue you shared above. Let us know! Thanks to @Werner Stinckens for jumping in, as always!

  • 3 kudos
4 More Replies
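
For context, the optimized-writes settings discussed in the question above can be applied at the session/cluster level or per table; a hedged sketch with a placeholder table name (verify the property names against your Databricks Runtime version):

# Session or cluster level (e.g., a notebook cell or the cluster Spark config):
spark.conf.set("spark.databricks.delta.optimizeWrite.enabled", "true")
spark.conf.set("spark.databricks.delta.autoCompact.enabled", "true")

# Per-table level, via Delta table properties:
spark.sql("""
ALTER TABLE my_db.my_table SET TBLPROPERTIES (
  'delta.autoOptimize.optimizeWrite' = 'true',
  'delta.autoOptimize.autoCompact'   = 'true'
)
""")
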
brickster_2018
by Databricks Employee
  • 1312 Views
  • 1 reply
  • 0 kudos

Resolved! How to track the progress of a VACUUM command.

My VACUUM command is stuck. I am not sure if it's deleting any files. 

Latest Reply
brickster_2018
Databricks Employee
  • 0 kudos

There is no direct way to track the progress of the VACUUM command. One easy workaround is to run a DRY RUN from another notebook, which gives an estimate of the files to be deleted at that point in time. This will give a rough estimate of files to b...

  • 0 kudos
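
A quick sketch of the DRY RUN workaround described in the reply above, with a placeholder table name:

# Run from a second notebook while the real VACUUM is in flight.
# DRY RUN only lists files eligible for deletion (up to a limit); it removes nothing.
spark.sql("VACUUM my_db.my_table RETAIN 168 HOURS DRY RUN").show(truncate=False)
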
brickster_2018
by Databricks Employee
  • 1386 Views
  • 1 reply
  • 0 kudos

Resolved! Can I give partition filter conditions for the VACUUM command, similar to OPTIMIZE?

For the OPTIMIZE command, I can give predicates, and it's easy to optimize the partitions where data is added. Similarly, can I specify a "WHERE" clause on the partition columns for a VACUUM command?

Latest Reply
brickster_2018
Databricks Employee
  • 0 kudos

It's by design: the VACUUM command does not support filters on the partition columns. This is because partially removing old files can impact the time travel feature. 

  • 0 kudos
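
To illustrate the asymmetry described in the reply above, assuming a hypothetical table partitioned by event_date:

# OPTIMIZE accepts a WHERE clause on partition columns:
spark.sql("OPTIMIZE my_db.my_table WHERE event_date >= '2023-01-01'")

# VACUUM takes no WHERE clause; it always evaluates the whole table, keeping every
# file still referenced within the retention window (needed for time travel):
spark.sql("VACUUM my_db.my_table RETAIN 168 HOURS")
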