Lazy evaluation in serverless vs. all-purpose compute?
04-14-2025 09:29 PM
As you can see, I am currently connected to serverless compute, and when I give a wrong path, Spark evaluates lazily and raises the error only on display.
However, when I switch from serverless to my all-purpose cluster, I get the error as soon as I create the DataFrame.
Why is that?
04-15-2025 06:01 AM - edited 04-15-2025 06:09 AM
Hi @aniket07
On serverless compute, Spark does not check whether the path exists until you run an action (like display()), so the error appears then. On all-purpose clusters, Spark checks the path immediately when you create the DataFrame, so you see the error right away.
The difference comes from how each environment handles path validation and when it accesses storage.
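One way to confirm which behaviour you are getting is to split the two steps and see which one raises. This is only a minimal sketch, assuming a notebook where `spark` is already defined; the helper name `where_does_it_fail`, the parquet format, and the example path are hypothetical, and `count()` stands in for `display()` as the action.

```python
def where_does_it_fail(spark, path):
    """Report which step raises for `path`: DataFrame creation or the action."""
    try:
        # Only builds the DataFrame; no action is run here.
        df = spark.read.format("parquet").load(path)
    except Exception as exc:  # broad on purpose: the exception class may differ by runtime
        return f"failed at DataFrame creation: {type(exc).__name__}"

    try:
        # count() is an action, so it forces the read to actually execute.
        df.count()
    except Exception as exc:
        return f"failed at action: {type(exc).__name__}"

    return "no error"


# Example call in a notebook (hypothetical path):
# print(where_does_it_fail(spark, "/path/that/does/not/exist"))
```

Running the same call once on serverless and once on the all-purpose cluster should show the error moving from one step to the other, matching what you observed.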
04-15-2025 11:01 AM
Based on the scenario, what https://community.databricks.com/t5/user/viewprofilepage/user-id/156441 says is correct, although the eager evaluation property is false in both cases; on all-purpose clusters, Spark simply checks the path immediately when you create the DataFrame.