09-04-2015 12:18 AM
For example, I have one.py and two.py in Databricks, and I want to use one of the modules from one.py in two.py. On my local machine I usually do this with an import statement, like below:
two.py
from one import module1
.
.
.
How do I do this in Databricks?
12-30-2019 03:20 AM
05-03-2019 12:13 AM
Is it possible to import a particular function using %run statement instead of running the whole notebook?
07-18-2019 07:55 AM
Databricks is a very rigid environment. They don't promote modularizing code. I don't understand why they disable or discourage the use of such basic concepts, which are common to all programming languages.
Even after so many years, this is still a problem.
11-25-2020 12:06 PM
Exactly. I don't understand what the path of the current notebook is when I connect over SSH.
07-18-2019 03:18 PM
The way to solve this problem is to add the path of your code to the system, then proceed to import modules selectively or all modules in a file.
You can download an example notebook from here https://github.com/javierorozco/databricks_import_python_module
import sys
# Add the path to system, local or mounted S3 bucket, e.g. /dbfs/mnt/<path_to_bucket>
sys.path.append('/databricks/driver/')
sys.path.append('/databricks/driver/databricks_import_python_module/')
sys.path.append('/databricks/driver/databricks_import_python_module/test.py')
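As a minimal sketch of the same idea that runs anywhere: the snippet below writes a small module into a temporary folder, adds that folder to sys.path, and imports one function from it. The temporary folder is only a stand-in for a driver or /dbfs location, and the module contents are made up for illustration. Note that Python expects sys.path entries to be directories (or zip archives), so it is the folder containing the .py file that needs to be appended.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical stand-in for a folder such as /dbfs/mnt/<path_to_bucket>
module_dir = Path(tempfile.mkdtemp())

# Write a tiny example module, one.py, into that folder (illustrative content)
(module_dir / "one.py").write_text(
    "def module1():\n"
    "    return 'hello from one'\n"
)

# Append the folder that contains the module, not the .py file itself
if str(module_dir) not in sys.path:
    sys.path.append(str(module_dir))

# Make sure the import system notices the newly created file
importlib.invalidate_caches()

# Selective import, as in the original question
from one import module1

result = module1()
print(result)
```

If the module file is edited later in the same session, importlib.reload on the imported module is one way to pick up the change without restarting Python.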
12-24-2019 08:49 PM
@javier.orozco@realeuesit.com, I was able to work with the file created in your repo (test.py), but with my own modules I am getting an error. Is there anything I am missing?
05-13-2020 06:27 PM
It works for me to upload my Python file to DBFS using the Databricks CLI:
dbfs cp mymodule.py dbfs:/path/to/module/mymodule.py --overwrite
Then the following works:
import sys
sys.path.append('/dbfs/path/to/module')
#the file is /dbfs/path/to/module/mymodule.py
import mymodule
11-28-2020 02:13 AM
It's a nice hack, but how do I connect to the cluster driver to do real remote SSH development? I could connect via SSH to the driver, but it seems there is a different Python there, which has no pyspark.
11-28-2020 02:11 AM
This is very cumbersome for someone who is used to developing data science projects with modules, packages, and classes, and not just notebooks. Why doesn't Databricks allow this? I know about databricks-connect, but it does not solve the problem, as the driver runs locally and not remotely. What I want is a real SSH remote development experience.
10-11-2021 09:43 AM
USE REPOS! 😁
A notebook in Repos can call a function that is in a file in the same GitHub repo, as long as Files is enabled in the admin panel.
So if I have utils.py with:
import pandas as pd
 
def clean_data():
  # Load wine data
  data = pd.read_csv("/dbfs/databricks-datasets/wine-quality/winequality-white.csv", sep=";")
  print(data)
  # Remove spaces from column names
  data.rename(columns=lambda x: x.replace(' ', '_'), inplace=True)
My notebook can call the above with this:
import utils
 
utils.clean_data()
01-22-2025 08:14 AM
This alternative worked for us: https://community.databricks.com/t5/data-engineering/is-it-possible-to-import-functions-from-a-modul...