Heya 🙂
So I'm working on a new workflow. I started by writing a notebook and running it on an interactive cluster with "Single User" access mode, and everything worked fine.
I then created a workflow for this task using the same interactive cluster, and that also ran fine.
The issue started when I changed the cluster to a job compute cluster instead of the interactive one. I used the same configuration on both clusters, but when I ran the workflow on job compute, the application got stuck.
Some more information: I'm reading data from Snowflake and then processing it. After changing the workflow to a job compute cluster, the run didn't finish for a long time (well past the point where it would have finished on the interactive cluster). I checked the Spark UI and didn't see any jobs. I also checked the Snowflake query history and saw that Snowflake did receive the query from Spark and finished executing it.
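For reference, the read itself is nothing special. It looks roughly like this (a minimal sketch; the connection options and query are placeholders, not my real values):

```python
# Minimal sketch of the Snowflake read (spark-snowflake connector).
# All connection options below are placeholders; auth details elided.
sf_options = {
    "sfURL": "<account>.snowflakecomputing.com",
    "sfUser": "<user>",
    "sfDatabase": "<database>",
    "sfSchema": "<schema>",
    "sfWarehouse": "<warehouse>",
}

df = (
    spark.read  # `spark` is the session Databricks provides
    .format("snowflake")
    .options(**sf_options)
    .option("query", "SELECT * FROM some_table")  # hypothetical query
    .load()
)

df.count()  # on job compute this is roughly where it hangs; no Spark jobs appear
```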
After hours of searching, I found this thread on Reddit:
https://www.reddit.com/r/databricks/comments/1d4tcyy/job_cluster_stuck/
It suggested switching to the "No isolation shared" access mode. I did that, and it did the trick. However, I need to use Unity Catalog, and that access mode doesn't support it. I then tried the "Shared" access mode, but the program threw an exception when trying to access the S3 API (the cluster has an instance profile with sufficient permissions, and it worked fine on the single-user cluster).
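For anyone who wants to reproduce the configurations, here's roughly how those access modes map onto the job cluster spec in the Clusters API (a sketch with placeholder values; to my understanding, `data_security_mode` is the API field behind the UI's access mode setting):

```python
# Sketch of the job cluster spec; all concrete values are placeholders.
# data_security_mode values vs. the UI access modes (as I understand them):
#   "SINGLE_USER"    -> Single user (worked fine on the interactive cluster)
#   "USER_ISOLATION" -> Shared (throws the S3 exception)
#   "NONE"           -> No isolation shared (runs, but no Unity Catalog)
new_cluster = {
    "spark_version": "<same as the interactive cluster>",
    "node_type_id": "<same as the interactive cluster>",
    "num_workers": 2,  # placeholder
    "aws_attributes": {
        "instance_profile_arn": "<my instance profile ARN>",
    },
    "data_security_mode": "SINGLE_USER",
    "single_user_name": "<my user>",  # used when data_security_mode is SINGLE_USER
}
```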
Other than that, I searched the driver and executor logs and didn't find anything. The only thing I did "find" is that on the interactive cluster, the driver logs contain the data it got from Snowflake, while on the job compute cluster's driver I don't see that log at all.
So, any idea how to solve this? Or alternatively, a different approach that would let me use a job compute cluster with Unity Catalog without hitting this problem?
Thanks!