
Limitation on size of init script

Rahul2025
New Contributor III

Hi,

We're using Databricks Runtime 11.3 LTS and executing a Spark Java job on a job cluster. To automate the execution of this job, we need to define some environment variables (sourced from bash config files) through a cluster-scoped init script and make them available to the Spark Java job.
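
For context, a minimal sketch of what such a cluster-scoped init script might look like is shown below; it assumes the bash config files live under an illustrative DBFS path and that exported variables are surfaced to the Spark JVMs by appending them to /databricks/spark/conf/spark-env.sh (a commonly used pattern; all paths, file names, and variable names here are hypothetical):

    #!/bin/bash
    # Minimal cluster-scoped init script sketch (illustrative; paths and names are hypothetical).
    # Sources bash config files from DBFS and exposes selected variables to the Spark
    # driver and executors by appending export statements to spark-env.sh.
    set -euo pipefail

    CONFIG_DIR=/dbfs/databricks/scripts/config   # hypothetical location of the config files

    for f in "$CONFIG_DIR"/*.conf; do
      if [ -f "$f" ]; then
        # shellcheck disable=SC1090
        source "$f"
      fi
    done

    # Make the variables visible to the Spark JVMs started after the init script runs.
    SPARK_ENV=/databricks/spark/conf/spark-env.sh
    {
      echo "export APP_ENV=${APP_ENV:-dev}"
      echo "export APP_CONFIG_DIR=${CONFIG_DIR}"
    } >> "$SPARK_ENV"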

While doing this, we realized that we're not able to upload an init script larger than 5 KB to the appropriate DBFS location. Apparently the documentation states that the init script should not be larger than 64 KB. Are there any settings/configurations at the workspace level that can help us raise this limit from 5 KB to 64 KB?

Please let us know if we're missing anything here. Any help in this regard will be highly appreciated.

Thanks in advance!

Regards,

// Rahul

11 REPLIES

Lakshay
Esteemed Contributor

Hi @Rahul K, could you please share a screenshot of the error you are getting?

Rahul2025
New Contributor III

Thank you @Lakshay Goel for your response, and sorry for the delayed reply. I'm not able to attach a screenshot here.

Please find below the error I'm getting while uploading a file named db_init.sh to the /databricks/init folder.

Error occurred when processing file db_init.sh: Server responded with 0 code.

karthik_p
Esteemed Contributor

@Rahul K it looks like you're hitting a documented limitation:

The init script cannot be larger than 64KB. If a script exceeds that size, an error message appears when you try to save.

Also, init scripts cannot be stored on DBFS external mounts. @Lakshay Goel any other inputs, please?

Rahul2025
New Contributor III

Thank you @karthik p for your response. I'm trying to upload a script named db_init.sh (10 KB in size) to the /databricks/init folder and getting the following error:

Error occurred when processing file db_init.sh: Server responded with 0 code.

If the script were larger than 64 KB, the error would be expected, but it's only 10 KB. Please let me know if there are any settings that help increase this limit.

@Rahul K - Databricks recommends you avoid storing init scripts under /databricks/init (which is now a legacy location) to avoid unexpected behaviour. Try using the new global init scripts via the UI, the API, or Terraform, and see if the issue persists.

Reference: https://learn.microsoft.com/en-us/azure/databricks/clusters/init-scripts
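
If the global init script route works for you, one possible way to register the script programmatically is the Global Init Scripts REST API; the sketch below is only an illustration, with placeholder host and token values, and the endpoint and payload should be verified against the current API documentation:

    # Sketch: register db_init.sh as a global init script via the REST API.
    # Host and token are placeholders; base64 -w0 assumes GNU coreutils.
    DATABRICKS_HOST="https://<your-workspace>.azuredatabricks.net"
    DATABRICKS_TOKEN="<personal-access-token>"

    SCRIPT_B64=$(base64 -w0 db_init.sh)   # the script body must be base64-encoded

    curl -s -X POST "$DATABRICKS_HOST/api/2.0/global-init-scripts" \
      -H "Authorization: Bearer $DATABRICKS_TOKEN" \
      -H "Content-Type: application/json" \
      -d "{\"name\": \"db_init\", \"script\": \"$SCRIPT_B64\", \"enabled\": true, \"position\": 0}"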

Rahul2025
New Contributor III

Thank you @Sundar Raman for your response. Yes, we referred to the same link you shared, and we currently use a cluster-scoped script (option #4 below) rather than a global one (option #3), because the script needs to be executed for only one cluster and not all clusters in the workspace. We're not using options #1 and #2 as they've been deprecated.

  1. Legacy global (deprecated)
  2. Cluster-named (deprecated)
  3. Global (new)
  4. Cluster-scoped

This cluster-scoped init script has been provided from the UI and uploaded/deployed to the DBFS root, as described in the shared link (Section - Cluster-scoped init script locations). I'll check if referring to it directly from ADLS helps address the size limitation.

Oh ok! A quick question @Rahul K! Are you still using /databricks/init, given that it's the legacy global path? Have you tried saving the cluster-scoped script to a different location such as /databricks/scripts?
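
For reference, a sketch of copying the script to such a location with the Databricks CLI instead of the workspace file uploader (paths are illustrative, and this workaround is not confirmed in this thread):

    # Sketch: upload the cluster-scoped init script with the Databricks CLI,
    # avoiding both the UI uploader and the legacy /databricks/init folder.
    databricks fs cp db_init.sh dbfs:/databricks/scripts/db_init.sh --overwrite

    # Then reference it in the cluster's init script configuration (UI or Clusters API), e.g.:
    #   "init_scripts": [ { "dbfs": { "destination": "dbfs:/databricks/scripts/db_init.sh" } } ]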

Rahul2025
New Contributor III

Yes @Sundar Raman​, but that also gives the same error.

@Rahul K Also, can you please confirm that the location is not a DBFS mount? Have you tried any other DBR version, and do you still have the same issue there?

Rahul2025
New Contributor III

Yes @Sundar Raman, the location is not a DBFS mount. We tried this in a workspace where DBR 10.4 LTS and 11.3 LTS are being used.

Anonymous
Not applicable

Hi @Rahul K​ 

Thank you for posting your question in our community! We are happy to assist you.

To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers your question?

This will also help other community members who may have similar questions in the future. Thank you for your participation and let us know if you need any further assistance! 
