Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Databricks Python Import Bug

alvaro_databric
New Contributor III

Hello,

I am reaching out to the community to shed some light on a bug I have been encountering recently. The two setups are as follows:

SETUP-1 (WORKS):

  • Python Notebook in Workspace FileSystem (this is Shared/folder/...)
  • Custom Python wheel library (.whl) installed on the cluster

SETUP-2 (DOES NOT WORK)

  • Python Notebook in Repos
  • Same wheel and cluster as SETUP-1

Moreover, SETUP-2 is able to import some of the functions but not all of them. At first I thought it was an issue with the wheel generation, but SETUP-1 works just fine and is able to import everything.

This makes me think that there might be a difference in how Databricks manages the filesystem, or some other variable, in Repos that I am not grasping.

Can someone point out what the issue could be here, or any important difference when working with Databricks Repos?
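One way to narrow down a partial-import problem like this is to check which file a module actually resolves to and which names it exposes, in both setups. The sketch below is only a diagnostic aid, not a confirmed explanation of the issue: `describe_import` is a hypothetical helper, and the standard-library `json` module stands in for the custom wheel's top-level package, whose name is not given in the thread.

```python
import importlib
import importlib.util
import sys


def describe_import(module_name, expected_names=()):
    """Report where a top-level module resolves from and which expected names it lacks."""
    spec = importlib.util.find_spec(module_name)
    if spec is None:
        # Nothing on sys.path provides this module.
        return {"found": False, "origin": None, "missing": list(expected_names)}
    module = importlib.import_module(module_name)
    missing = [name for name in expected_names if not hasattr(module, name)]
    return {"found": True, "origin": spec.origin, "missing": missing}


# "json" is a placeholder; substitute the wheel's package and the functions you expect.
report = describe_import("json", expected_names=("loads", "not_a_real_function"))
print("resolved from:", report["origin"])
print("missing names:", report["missing"])

# The first sys.path entries win, so it helps to compare them between the two setups.
for entry in sys.path[:5]:
    print(entry)
```

If the reported origin differs between the Workspace notebook and the Repos notebook, the two notebooks are not importing the same copy of the package, which would be consistent with some functions appearing to be missing.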

1 ACCEPTED SOLUTION

alvaro_databric
New Contributor III

The solution came with a runtime update. As stated in "Work with Python and R modules | Databricks on AWS": In Databricks Runtime 13.0 and above, directories added to the Python sys.path are automatically distributed to all executors in the cluster. In Databricks Runtime 12.2 LTS and below, libraries added to the sys.path must be explicitly installed on executors.

This seems to have solved our strange import problem from Databricks Repos.
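For readers unfamiliar with what "directories added to the Python sys.path" looks like in practice, here is a minimal sketch. `add_to_sys_path` is a hypothetical helper and `"src"` is an assumed folder name, not something from this thread. The code only changes `sys.path` in the process where it runs (the driver, in a notebook); whether that directory is then distributed to executors depends on the runtime version, as described in the quoted documentation.

```python
import os
import sys


def add_to_sys_path(directory):
    """Append a directory to sys.path once and return its absolute path."""
    path = os.path.abspath(directory)
    if path not in sys.path:
        sys.path.append(path)
    return path


# "src" is a hypothetical folder relative to the current working directory.
repo_src = add_to_sys_path("src")
print(repo_src in sys.path)
```

Appending rather than inserting at the front keeps already-installed libraries first in the lookup order, so the added directory cannot accidentally shadow them.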


3 REPLIES

Debayan
Databricks Employee

Hi,

How can I run non-Databricks notebook files in a repo?

For example, a .py file?

You can use any of the following:

You can refer to: https://docs.databricks.com/repos/limits.html and https://docs.databricks.com/repos/limits.html#non-notebook-files-files-in-repos

Also, refer to the errors: https://docs.databricks.com/repos/errors-troubleshooting.html

Please let us know if this helps, and share any errors you see. Also, please tag @Debayan in your next comment so that I get notified. Thank you!

Anonymous
Not applicable

Hi @Alvaro Moure

Hope all is well! Just wanted to check in if you were able to resolve your issue and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. 

We'd love to hear from you.

Thanks!
