Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Aviral-Bhardwaj
by Esteemed Contributor III
  • 8011 Views
  • 2 replies
  • 2 kudos

Resolved! Can anyone help with a spill question?

Spill occurs as a result of executing various wide transformations. However, diagnosing a spill requires one to proactively look for key indicators. Where in the Spark UI are two of the primary indicators that a partition is spilling to disk? a- Exec...

Latest Reply
pvignesh92
Honored Contributor
  • 2 kudos

@Aviral Bhardwaj I feel it is Option e, stage and executor log files: consolidated details at the stage level, and details at the task and executor level. Please let me know if you feel any other option is better.

1 More Replies
Atacama
by New Contributor II
  • 2321 Views
  • 3 replies
  • 1 kudos
Latest Reply
-werners-
Esteemed Contributor III
  • 1 kudos

The spilled data is written to some object store on the cloud provider. I believe all of them apply encryption by default. Of course, it is up to you (or your colleagues) to restrict access to the storage.

2 More Replies
user_b22ce5eeAl
by New Contributor II
  • 1505 Views
  • 2 replies
  • 0 kudos

pandas udf type grouped map fails

Hello, I am trying to get the SHAP values for my whole dataset using a pandas UDF for each category of a categorical variable. It runs well when I run it on a few categories, but when I want to run the function on the whole dataset my job fails. I see ...

Latest Reply
Jackson
New Contributor II
  • 0 kudos

I want to use data.groupby.apply() to apply a function to each row of my PySpark DataFrame per group. I used the Grouped Map Pandas UDFs. However, I can't figure out how to add another argument to my function. I tried using the ar...

1 More Replies
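
On the question above about passing another argument to a grouped map function: one common pattern is to bind the extra value in a closure, so the function handed to Spark still takes only the group's pandas DataFrame. The sketch below is a minimal illustration under assumptions, not the poster's code: the names `make_group_fn`, `scale_group`, `factor`, and the columns `category` and `value` are all hypothetical. It is checked locally with plain pandas; the Spark call is shown only as a comment.

```python
import pandas as pd


def make_group_fn(factor: float):
    """Return a one-argument function with `factor` bound via a closure.

    `factor` stands in for whatever extra argument the per-group
    function needs (hypothetical name).
    """

    def scale_group(pdf: pd.DataFrame) -> pd.DataFrame:
        # Runs once per group; `factor` comes from the enclosing scope.
        out = pdf.copy()
        out["scaled"] = out["value"] * factor
        return out

    return scale_group


# Local check with plain pandas: call the function on each group by hand.
data = pd.DataFrame({"category": ["a", "a", "b"], "value": [1.0, 2.0, 3.0]})
group_fn = make_group_fn(factor=10.0)
result = pd.concat(
    [group_fn(group) for _, group in data.groupby("category")],
    ignore_index=True,
)

# With a PySpark DataFrame (assumed to be named spark_df, with the same
# columns), the same function could be passed to the grouped map API:
#   spark_df.groupBy("category").applyInPandas(
#       group_fn, schema="category string, value double, scaled double"
#   )
```

Because the closure captures the value when `make_group_fn` is called, a different argument only needs a new call to the factory; the per-group function itself keeps the single-DataFrame signature that the grouped map API expects.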