Machine Learning

How to tune a job to avoid paying extra cost for EXPAND DISK? Is it caused by shuffle or by data skew? Is there a way to configure the workers with larger disks? Without EXPAND DISK the job would fail, since there would be no space left on disk.
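EXPAND DISK events are emitted when Databricks autoscaling local storage attaches extra volumes to workers that have run out of disk, typically during large shuffles. One way to avoid repeated expansions (and their cost) is to provision enough disk up front. A minimal sketch, assuming an AWS workspace with DATABRICKS_HOST and DATABRICKS_TOKEN set in the environment; the cluster name, runtime, node type, and sizes are illustrative assumptions, not recommendations:

```python
# Minimal sketch: create a cluster whose workers get large EBS volumes up
# front, so autoscaling local storage does not have to expand disks mid-job.
import os
import requests

payload = {
    "cluster_name": "large-disk-workers",   # hypothetical name
    "spark_version": "13.3.x-scala2.12",    # pick a supported runtime
    "node_type_id": "m5.xlarge",            # no local NVMe, so EBS sizing matters
    "num_workers": 4,
    "aws_attributes": {
        "ebs_volume_type": "GENERAL_PURPOSE_SSD",
        "ebs_volume_count": 1,
        "ebs_volume_size": 500,             # GB per worker, sized for shuffle spill
    },
}

resp = requests.post(
    f"{os.environ['DATABRICKS_HOST']}/api/2.0/clusters/create",
    headers={"Authorization": f"Bearer {os.environ['DATABRICKS_TOKEN']}"},
    json=payload,
    timeout=30,
)
resp.raise_for_status()
print(resp.json())  # returns {"cluster_id": "..."}
```

The ebs_volume_* fields are AWS-specific; on Azure and GCP the equivalent lever is choosing a worker VM type with more local disk.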

Wayne
New Contributor III
 
2 REPLIES

Anonymous
Not applicable

Your worker disk size should almost never matter. You should be using cloud storage such as S3, ADLS Gen2, or GCS. What operations are you running, and what error message are you getting?
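For the shuffle/skew half of the question, a minimal sketch of standard Spark 3 adaptive-execution settings that often reduce how much a job spills to local disk; it assumes a Databricks notebook where `spark` is the active SparkSession, and the paths and join key are hypothetical:

```python
# Standard Spark 3 settings that mitigate skewed shuffles, a common cause
# of heavy local-disk usage during wide operations.
spark.conf.set("spark.sql.adaptive.enabled", "true")           # adaptive query execution
spark.conf.set("spark.sql.adaptive.skewJoin.enabled", "true")  # split skewed partitions
spark.conf.set("spark.sql.shuffle.partitions", "2048")         # more, smaller shuffle blocks

# Smaller, more numerous partitions spill less per task; the right number
# depends on the data volume.
df = spark.read.parquet("s3://example-bucket/input/")
df.repartition(2048, "join_key").write.parquet("s3://example-bucket/output/")
```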

Wayne
New Contributor III

No error; I'm just seeing EXPAND DISK in the cluster event logs. This is a regular Spark application. I'm not sure cloud storage matters here: the application already uses it for input and output.
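To confirm it is only disk expansion, and how often it happens, the same event log can be pulled programmatically. A minimal sketch against the standard Clusters API; the host, token, and cluster_id are placeholders:

```python
# Minimal sketch: pull recent cluster events and filter for disk-expansion
# entries via the Clusters API (POST /api/2.0/clusters/events).
import os
import requests

resp = requests.post(
    f"{os.environ['DATABRICKS_HOST']}/api/2.0/clusters/events",
    headers={"Authorization": f"Bearer {os.environ['DATABRICKS_TOKEN']}"},
    json={"cluster_id": "0123-456789-abcdefgh", "limit": 100},
    timeout=30,
)
resp.raise_for_status()

for event in resp.json().get("events", []):
    # Disk-related event types (e.g. EXPANDED_DISK) show when and how often
    # autoscaling local storage kicked in.
    if "DISK" in event.get("type", ""):
        print(event["timestamp"], event["type"], event.get("details", {}))
```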