Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

How to access a jar file stored in a Databricks Workspace folder?

ranged_coop
Valued Contributor II

Hi All,

We have a couple of jars stored in a workspace folder.

We are using init scripts to copy the jars in the workspace to the /databricks/jars path.

The init scripts do not seem to be able to find the files.

The scripts are failing saying the files could not be found.

```bash
#!/bin/bash
cp /Workspace/jars/file_name.jar /databricks/jars/
cp /Workspace/jars/file_name.jar /databricks/databricks-hive/
```

Can you please let me know if this is even possible?

What is the correct path for a file in a Workspace folder?

Thank you...

Edit:

  1. Have tried file path with and without /Workspace - both are failing saying file is not available.
  2. I have also tried sleep for up to 2 minutes and the files are still not available.

It would be nice if someone from Databricks could confirm whether binary files in the Workspace, such as jars, are accessible from init scripts and, if so, what their path would look like.
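For anyone reproducing this, here is a minimal sketch of a more defensive version of the script above. It tries the jar path with and without the /Workspace prefix, copies only when a source file is actually visible, and writes what happened to a log file instead of failing on the first `cp`. The log location, the second candidate path, and the function names are my own assumptions for illustration; nothing here is confirmed Databricks behaviour.

```bash
#!/bin/bash
# Sketch only: LOG_FILE and the candidate paths are illustrative assumptions.
LOG_FILE="${LOG_FILE:-/tmp/init_jar_copy.log}"

# Print the first candidate path that exists as a regular file.
first_existing() {
  local candidate
  for candidate in "$@"; do
    if [[ -f "$candidate" ]]; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  return 1
}

# Copy the jar into each destination directory, logging success or failure.
copy_jar() {
  local src="$1"; shift
  local dest
  for dest in "$@"; do
    if [[ -d "$dest" ]] && cp "$src" "$dest/"; then
      echo "copied $src to $dest" >> "$LOG_FILE"
    else
      echo "could not copy $src to $dest" >> "$LOG_FILE"
    fi
  done
}

# Try the path with and without the /Workspace prefix, as described in the post.
if src="$(first_existing /Workspace/jars/file_name.jar /jars/file_name.jar)"; then
  copy_jar "$src" /databricks/jars /databricks/databricks-hive
else
  echo "jar not visible under any candidate path" >> "$LOG_FILE"
fi
```

This does not make the jar available if the Workspace path is simply not visible to init scripts; it only lets the script finish and leaves a record of which path, if any, was found.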

23 REPLIES

-werners-
Esteemed Contributor III

Perhaps try `tree -d`? (You will probably have to install it first.)

There is also the web terminal, but I have never used it.

Hi @Bharath Kumar Ramachandran, we haven't heard from you since the last response from @Werner Stinckens, and I was checking back to see if his suggestions helped you.

Otherwise, if you have found a solution, please share it with the community, as it can be helpful to others.

Also, please don't forget the "Select As Best" button whenever the information provided helps resolve your question.

ranged_coop
Valued Contributor II

Thank you for your response, @Kaniz Fatma.

The issue is still not resolved.

I was hoping someone from Databricks would be able to help. Please help if possible.

Steps to Reproduce:

  1. Have two folders in Workspace - init_scripts and jars.
  2. Have any jar file in the jars folder.
  3. Have an init script in the init_scripts folder that copies the jar file from the jars folder into the /databricks/databricks-hive/ folder in the cluster.

```bash
#!/bin/bash
cp /Workspace/jars/file_name.jar /databricks/jars/
cp /Workspace/jars/file_name.jar /databricks/databricks-hive/
```

  4. Configure the init script in the workspace in the Cluster configuration.

Expected:

  1. Cluster needs to start
  2. The jar has to be copied to the /databricks/databricks-hive/ folder.

Current Behaviour:

  1. The cluster is not starting.
  2. Init script is failing stating that the source file is not available.

Anonymous
Not applicable

Hi @Bharath Kumar Ramachandran

We haven't heard from you since the last response from @Werner Stinckens. Kindly share the information with us, and in return, we will provide you with the necessary solution.

Thanks and Regards

ranged_coop
Valued Contributor II

Hi @Vidula Khanna, thank you for your response. I had also responded to a similar message from @Kaniz Fatma. Anyway, the issue is still not resolved.

I was hoping someone from Databricks would be able to help. Please help if possible.

Steps to Reproduce:

  1. Have two folders in Workspace - init_scripts and jars.
  2. Have any jar file in the jars folder.
  3. Have an init script in the init_scripts folder that copies the jar file from the jars folder into the /databricks/databricks-hive/ folder in the cluster.

```bash
#!/bin/bash
cp /Workspace/jars/file_name.jar /databricks/jars/
cp /Workspace/jars/file_name.jar /databricks/databricks-hive/
```

  4. Configure the init script in the workspace in the Cluster configuration.

Expected:

  1. Cluster needs to start without any error.
  2. The jar has to be copied to the /databricks/databricks-hive/ folder.

Current Behaviour:

  1. The cluster is not starting.
  2. Init script is failing stating that the source file is not available.

Things already tried:

  1. Have tried file path with and without /Workspace - both are failing saying file is not available.
  2. I have also tried sleep for up to 2 minutes and the files are still not available.
  3. Have tried an init script that would just list the files in the path and print to a file. Even listing fails stating file/path is not available.
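For reference, the listing attempt in item 3 can be narrowed down with a small probe like the one below. It is only a sketch: it walks from the jar path up to `/` and records whether each level exists, which shows whether the whole /Workspace tree is missing or only the jars folder. The log location and function name are assumptions of mine, not anything Databricks documents.

```bash
#!/bin/bash
# Sketch only: PROBE_LOG and the probed path are illustrative assumptions.
PROBE_LOG="${PROBE_LOG:-/tmp/init_path_probe.log}"

# Walk from the given path up to its root, reporting whether each level exists.
probe_path() {
  local path="$1"
  while :; do
    if [[ -e "$path" ]]; then
      echo "exists:  $path"
    else
      echo "missing: $path"
    fi
    # Stop once the root (or "." for a relative path) has been reported.
    [[ "$path" == "/" || "$path" == "." ]] && break
    path="$(dirname "$path")"
  done
}

# A missing level is only logged; probe_path itself still returns success.
probe_path /Workspace/jars/file_name.jar >> "$PROBE_LOG" 2>&1
```

Because a missing level is reported rather than treated as an error, the probe should not by itself be the reason the init script fails, so the log can be read after the cluster comes up.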

-werners-
Esteemed Contributor III

Someone else has the same issue as you and, well, it is not possible:

https://community.databricks.com/s/question/0D78Y000007uNLYSA2/detail

ranged_coop
Valued Contributor II

Thank you for sharing the link. It was useful.

It is a little sad to see that it is not possible, having spent so much time analyzing and trying out various options.

I hope this is from a valid source. If so, I hope Databricks will consider adding this option, seeing that many (well, at least 2 :)) are expecting this feature. Using the CLI and API would just complicate things and is not that practical.

Hi @Bharath Kumar Ramachandran, if you want a new feature to be added, you can request it at this link: https://docs.databricks.com/resources/ideas.html

Anonymous
Not applicable

Hi @Bharath Kumar Ramachandran

You're welcome! I'm glad you found the link useful. I empathize with your hope that Databricks would consider adding this option. It's possible that Databricks will take user feedback into account when planning future updates and enhancements.

Moreover, every best answer marked contributes to the growth and success of our community.
