
How to detect if running in a workflow job?

dollyb
New Contributor III

Hi there,

What's the best way to tell which environment my Spark session is running in? Locally I develop with databricks-connect's DatabricksSession, but that doesn't work when running a workflow job, which requires SparkSession.getOrCreate(). Right now I'm passing a job parameter that the app reads. Is there a more robust way to detect whether the app is running on a Databricks cluster or not?

 

1 ACCEPTED SOLUTION

Accepted Solutions

Kaniz
Community Manager
Community Manager

Hi @dollyb, when distinguishing between the environments your Spark session runs in, especially when transitioning from local development to a workflow job, it's essential to have robust detection.

Here are some approaches you can consider:

  1. Cluster Context Using Notebook Context:

  2. Spark Session Isolation:

  3. Check for Streaming Tab in Spark UI:

  4. Environment-Specific Configuration Parameters:

    • Consider using environment-specific configuration parameters.
    • For example, pass a parameter (as you're currently doing) that indicates whether the app is running on a Databricks cluster or not.
    • This approach provides flexibility and allows you to adapt to different environments.
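As a complement to approach 4, a common heuristic is to check for an environment variable that Databricks Runtime sets on its clusters (DATABRICKS_RUNTIME_VERSION) and fall back to your explicit parameter otherwise. A minimal sketch, assuming that variable is present on the cluster and absent on your development machine:

```python
import os

def running_on_databricks() -> bool:
    """Heuristic check: Databricks Runtime sets DATABRICKS_RUNTIME_VERSION
    in the cluster environment; a local dev machine normally does not."""
    return "DATABRICKS_RUNTIME_VERSION" in os.environ

def get_spark():
    """Pick a session factory based on the detected environment.
    Imports are deferred so each environment only needs its own package."""
    if running_on_databricks():
        from pyspark.sql import SparkSession
        return SparkSession.builder.getOrCreate()
    else:
        from databricks.connect import DatabricksSession
        return DatabricksSession.builder.getOrCreate()
```

Keeping the check in one helper means the rest of the application never needs to know which environment it is in; you can still combine it with an explicit job parameter as a belt-and-braces override.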

Remember that the choice depends on your specific use case and requirements. By combining these methods, you can create a robust mechanism to detect whether your Spark session is running on a Databricks cluster or elsewhere. 🚀

 

View solution in original post

2 REPLIES 2

(See the accepted solution above.)

dollyb
New Contributor III

Thanks, dbutils.notebook.getContext does indeed contain information about the job run.