Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Increase stack size Databricks

tgen
New Contributor II

Hi everyone,

I'm currently running a shell script in a notebook and encountering a segmentation fault caused by the stack size limit. I'd like to increase the stack size with ulimit -s unlimited, but I can't get that setting to take effect in the notebook environment.

I am using:

2-12 Workers: 256-1,536 GB Memory, 64-384 Cores
1 Driver: 256 GB Memory, 64 Cores
Runtime: 15.2.x-scala2.12

Could anyone provide guidance on how to properly increase the stack size for my shell script using Notebooks in Databricks? Any tips or alternative solutions to avoid the segmentation fault would also be greatly appreciated.
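Roughly what I'm trying, sketched as a %sh cell (the script path here is just a placeholder for my actual script):

```shell
# Raise the soft stack limit as far as the hard limit allows, then run the
# script in the same cell so the new limit applies to its child processes.
ulimit -s unlimited 2>/dev/null || ulimit -s "$(ulimit -Hs)"
ulimit -s                # print the limit now in effect
# ./my_script.sh         # placeholder for the actual script
```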

2 REPLIES

Kaniz_Fatma
Community Manager

Hi @tgen, To increase the stack size for your shell script in Databricks Notebooks, follow these steps:

  1. Spark Configuration Property: With Databricks Runtime 12.2 LTS and above, you can set the spark.databricks.driver.maxReplOutputLength Spark configuration property. This property controls the maximum output length for the REPL (Read-Eval-Print Loop) in the notebook environment.

  2. Setting the Configuration: In your Databricks Notebook, navigate to the “Advanced Options” section. Add the following configuration:

    spark.databricks.driver.maxReplOutputLength <desired_value>
    

    Replace <desired_value> with the desired stack size limit (e.g., unlimited).

  3. Restart the Notebook: After making this change, restart your notebook to apply the new configuration.

If you encounter any issues or need further assistance, feel free to ask! 😊

 

tgen
New Contributor II

Hi @Kaniz_Fatma ,

Thanks for your response. I tried this and unfortunately I could not get it to work.

When I set spark.databricks.driver.maxReplOutputLength to unlimited in the cluster configuration, running the notebook produced this error: Failure starting repl. Try detaching and re-attaching the notebook. I tried detaching and re-attaching the cluster and kept getting the same message. Looking into it more, it appears the property must be set to an integer value. I also tried this in the web terminal and still got the segmentation fault.

Next, I tried setting spark.databricks.driver.maxReplOutputLength to a very high number (e.g. 500000000) and received the same segmentation fault error when running it in the Notebook and web terminal.
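For what it's worth, a quick check (sketched below, assuming a bash shell) of the soft versus hard stack limits shows whether ulimit -s unlimited can succeed at all; a non-root shell can't raise the soft limit past a finite hard limit:

```shell
# Compare soft vs. hard stack limits; if the hard limit is a finite number,
# `ulimit -s unlimited` fails with "Operation not permitted" for a
# non-root shell, which would match the behaviour described above.
echo "soft stack limit: $(ulimit -Ss)"
echo "hard stack limit: $(ulimit -Hs)"
```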

Do you have any other ideas of things I could try?
