Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

How to access a jar file stored in the Databricks Workspace?

ranged_coop
Valued Contributor II

Hi All,

We have a couple of jars stored in a Workspace folder.

We are using init scripts to copy the jars from the Workspace to the /databricks/jars path.

The init scripts do not seem to be able to find the files; they fail saying the files could not be found.

```bash
#!/bin/bash
# Copy the jars from the Workspace folder onto the cluster's local disk.
cp /Workspace/jars/file_name.jar /databricks/jars/
cp /Workspace/jars/file_name.jar /databricks/databricks-hive/
```

Can you please let me know if this is even possible?

What is the correct path for a file in a Workspace folder?

Thank you...

Edit:

  1. I have tried the file path both with and without the /Workspace prefix - both fail saying the file is not available.
  2. I have also tried sleeping for up to 2 minutes and the files are still not available (a rough sketch of this variant is below).

It would be nice if someone from Databricks could confirm whether binary files in the Workspace, such as jars, are accessible from init scripts, and if so, what their path would look like.
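
For reference, the sleep/retry variant I tried looked roughly like this (the jar name is a placeholder):

```bash
#!/bin/bash
# Rough sketch of the retry variant: poll for the Workspace path for up to
# ~2 minutes before copying. The jar name is a placeholder.
for _ in $(seq 1 24); do
  [ -f /Workspace/jars/file_name.jar ] && break
  sleep 5
done
cp /Workspace/jars/file_name.jar /databricks/jars/
cp /Workspace/jars/file_name.jar /databricks/databricks-hive/
```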

21 REPLIES

-werners-
Esteemed Contributor III

Perhaps using tree -d? (You probably have to install it first.)

Or there is also the web terminal, but I have never used that.
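
For example, something along these lines from a notebook %sh cell (assuming a cluster where %sh runs as root):

```bash
# Sketch: explore the driver's directory layout to see where files land.
# tree is usually not preinstalled, so install it first.
apt-get update -qq && apt-get install -y tree
tree -d -L 2 /databricks
# Also check whether the Workspace mount is visible at all:
ls -la /Workspace 2>/dev/null || echo "/Workspace is not visible here"
```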

ranged_coop
Valued Contributor II

Thank you for your response @Kaniz Fatma.

The issue is still not resolved.

I was hoping someone from Databricks would be able to help. Please help if possible.

Steps to Reproduce:

  1. Have two folders in the Workspace - init_scripts and jars.
  2. Place any jar file in the jars folder.
  3. Add an init script in the init_scripts folder that copies the jar file from the jars folder into the /databricks/databricks-hive/ folder on the cluster.

```bash
#!/bin/bash
# Copy the jar from the Workspace folder onto the cluster's local disk.
cp /Workspace/jars/file_name.jar /databricks/jars/
cp /Workspace/jars/file_name.jar /databricks/databricks-hive/
```

  4. Configure the init script from the workspace location in the cluster configuration.

Expected:

  1. The cluster starts.
  2. The jar is copied to the /databricks/databricks-hive/ folder.

Current Behaviour:

  1. The cluster does not start.
  2. The init script fails, stating that the source file is not available.

Anonymous
Not applicable

Hi @Bharath Kumar Ramachandran,

We haven't heard from you since the last response from @Werner Stinckens. Kindly share the information with us, and in return, we will provide you with the necessary solution.

Thanks and regards

ranged_coop
Valued Contributor II

Hi @Vidula Khanna, thank you for your response. I had also responded to a similar message from @Kaniz Fatma. Anyway, the issue is still not resolved.

I was hoping someone from Databricks would be able to help. Please help if possible.

The steps to reproduce, expected result, and current behaviour are the same as in my reply above.

Things already tried:

  1. I have tried the file path both with and without the /Workspace prefix - both fail saying the file is not available.
  2. I have also tried sleeping for up to 2 minutes and the files are still not available.
  3. I have tried an init script that just lists the files in the path and writes the output to a file (rough sketch below). Even listing fails, stating the file/path is not available.
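
The listing-only script from point 3 looked roughly like this:

```bash
#!/bin/bash
# Diagnostic init script (rough sketch): list the Workspace paths and dump
# the output to local disk so it can be inspected after the cluster starts.
{
  echo "--- /Workspace ---"
  ls -la /Workspace
  echo "--- /Workspace/jars ---"
  ls -la /Workspace/jars
} > /tmp/init_script_listing.txt 2>&1
```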

-werners-
Esteemed Contributor III

Someone else has the same issue as you, and unfortunately it is not possible:

https://community.databricks.com/s/question/0D78Y000007uNLYSA2/detail

ranged_coop
Valued Contributor II

Thank you for sharing the link. It was useful.

It is a little sad to see that it is not possible, having spent so much time analyzing and trying out various options.

I hope this is from a valid source. If so, I hope Databricks will consider adding this option, seeing that many (well, at least 2 :)) are expecting this feature. Using the CLI or API would just complicate things and is not that practical (rough sketch below).
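
For illustration only, the API route would presumably look something like the sketch below; the export endpoint exists, but whether format=AUTO returns binary workspace files is an assumption, and a host and token would also have to be made available on the cluster, which is part of the extra complexity:

```bash
# Hypothetical sketch of the API alternative: fetch the jar through the
# Workspace export API instead of reading /Workspace directly.
# DATABRICKS_HOST / DATABRICKS_TOKEN and format=AUTO are assumptions here.
curl -sf -H "Authorization: Bearer $DATABRICKS_TOKEN" \
  "$DATABRICKS_HOST/api/2.0/workspace/export?path=/jars/file_name.jar&format=AUTO&direct_download=true" \
  -o /databricks/jars/file_name.jar
```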

Anonymous
Not applicable

Hi @Bharath Kumar Ramachandran,

You're welcome! I'm glad you found the link useful. I empathize with your hope that Databricks would consider adding this option. It's possible that Databricks will take user feedback into account when planning future updates and enhancements.

Also, marking a best answer contributes to the growth and success of our community.
