Increase stack size Databricks

tgen
New Contributor II

Hi everyone

I'm currently running a shell script in a notebook and encountering a segmentation fault, which is caused by the stack size limit. I'd like to raise the stack size with ulimit -s unlimited, but I'm having trouble setting this limit in the notebook environment.

I am using:

2-12 Workers: 256-1,536 GB Memory, 64-384 Cores
1 Driver: 256 GB Memory, 64 Cores
Runtime: 15.2.x-scala2.12

Could anyone provide guidance on how to properly increase the stack size for my shell script in Databricks notebooks? Any tips or alternative solutions to avoid the segmentation fault would also be greatly appreciated.
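
For context, a minimal sketch of what I'm attempting looks like the following (the script name is a placeholder; as far as I can tell, each %sh cell runs in its own shell on the driver, so the ulimit call has to happen in the same cell as the script):

    %sh
    # Raise the stack limit for this shell session, then run the
    # script in the same shell so the new limit actually applies.
    ulimit -s unlimited
    ./my_script.sh   # placeholder for the actual script

This is where I run into trouble in the notebook environment.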

2 REPLIES

Kaniz_Fatma
Community Manager

Hi @tgen, To increase the stack size for your shell script in Databricks Notebooks, follow these steps:

  1. Spark Configuration Property: With Databricks Runtime 12.2 LTS and above, you can increase the stack size via the spark.databricks.driver.maxReplOutputLength property. This property controls the maximum output length for the REPL (Read-Eval-Print Loop) in the notebook environment.

  2. Setting the Configuration: In your cluster configuration, open the “Advanced Options” section and add the following Spark configuration:

    spark.databricks.driver.maxReplOutputLength <desired_value>
    

    Replace <desired_value> with the desired stack size limit (e.g., unlimited).

  3. Restart the Notebook: After making this change, restart your notebook to apply the new configuration.

If you encounter any issues or need further assistance, feel free to ask! 😊

 

tgen
New Contributor II

Hi @Kaniz_Fatma ,

Thanks for your response. I tried this and unfortunately I could not get it to work.

When I set spark.databricks.driver.maxReplOutputLength to unlimited in the cluster configuration, I got this error message when running in the notebook: “Failure starting repl. Try detaching and re-attaching the notebook.” I tried detaching and re-attaching, and continued to get the same message. Looking into it more, it looks like the property has to be set to an integer value. I also tried this in the web terminal and continued to get the segmentation fault error.

Next, I tried setting spark.databricks.driver.maxReplOutputLength to a very high number (e.g., 500000000) and received the same segmentation fault error when running in both the notebook and the web terminal.
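
For reference, the line I added under the cluster's Spark config (Advanced Options) looked like this:

    spark.databricks.driver.maxReplOutputLength 500000000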

Do you have any other ideas of things I could try?
