So I'm the designated data engineer for a proof of concept we're running. I'm working with one infrastructure guy who's setting up everything in Terraform (company policy). He's got the Databricks setup down, so we can configure clusters and run notebooks that read from and write to ADLS Gen2 storage. However, I'm unable to run Workflows: Jobs time out after 10+ retries, and DLT pipelines seem to run forever until I stop them.
I suspect the two issues are related. Do you have any tips on how to move forward? Could it be a firewall? Some improperly configured infrastructure? On my own subscription, without limitations, I basically just use the portal to create a workspace in 2 minutes and everything works fine. Here are the error messages from the FIRST cell in the Job where it times out (yes, it times out on the very first cell), log4j first, standard output second:
https://gist.github.com/espenol/416609b4550bc682d375b03bb5d0619c
https://gist.github.com/espenol/b56b2ce08e5e2ed463141764ceeb5ef5
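
One way I could test the firewall theory is a quick TCP reachability check from a notebook cell on a cluster that does start. This is only a minimal sketch: the hostname in `ENDPOINTS` is a placeholder (not our real storage account), `check_endpoints` is a helper name I made up, and a successful TCP connect on port 443 would not prove that job clusters have the same network path.

```python
import socket

# Placeholder endpoints (assumption): swap in the hosts the workspace really needs.
ENDPOINTS = [
    ("mystorageaccount.dfs.core.windows.net", 443),  # hypothetical ADLS Gen2 account
]


def check_endpoints(endpoints, timeout=5.0, connect=socket.create_connection):
    """Attempt a TCP connection to each (host, port).

    Returns a dict mapping (host, port) to None when the connection
    succeeded, or to a short error string when it failed.
    """
    results = {}
    for host, port in endpoints:
        try:
            conn = connect((host, port), timeout=timeout)
            conn.close()
            results[(host, port)] = None
        except OSError as exc:  # covers DNS failures, refusals, and timeouts
            results[(host, port)] = f"{type(exc).__name__}: {exc}"
    return results
```

Calling `check_endpoints(ENDPOINTS)` in a cell and printing the result would show which hosts are blocked or unresolvable. The `connect` parameter is there only so the check can be exercised without network access.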