How to import a local Python file in a notebook?
09-04-2015 12:18 AM
For example, I have one.py and two.py in Databricks, and I want to use one of the modules from one.py in two.py. On my local machine I usually do this with an import statement like the one below.
two.py:
from one import module1
...
How do I do this in Databricks?
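For reference, this is the local pattern in full (a minimal sketch; module1 is a hypothetical stand-in for whatever one.py actually defines):
# one.py -- hypothetical contents for illustration
def module1():
    print("hello from one.py")

# two.py
from one import module1
module1()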
Labels:
- Import
- Local file
- Notebook
- Pyspark
12-30-2019 03:20 AM
Tried this. %run runs the .py file and prints a print statement from the external file. But what I want is to get a variable from the external file and use it in the current notebook, and that doesn't work. (I added some print statements; the variable in the file is called pseg_main.)
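For what it's worth, %run is designed for notebooks, not plain .py files: when the target is a notebook, its top-level variables do become visible in the caller. A minimal sketch, assuming a sibling notebook named pseg_config that assigns pseg_main at top level:
%run ./pseg_config

# in the next cell (%run must be alone in its cell):
print(pseg_main)   # pseg_main was assigned at top level of pseg_config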
05-03-2019 12:13 AM
Is it possible to import a particular function using the %run statement instead of running the whole notebook?
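As far as I know, %run has no way to pick out a single function; it always executes the whole target. Importing selectively needs a real import statement once the file's directory is on sys.path, along the lines of this sketch (helpers.py and clean_rows are hypothetical names):
from helpers import clean_rows   # imports just one function from helpers.py
clean_rows()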
07-18-2019 07:55 AM
Databricks is a very rigid environment. They don't promote modularizing code. I don't understand why they disable or dissuade the use of such basic concepts, which are generic to all programming languages.
Even after so many years, this is still a problem.
11-25-2020 12:06 PM
Exactly. And I don't understand what the path of the current notebook is if I SSH in.
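For what it's worth, from inside a notebook the path can be read off the notebook context via dbutils (a sketch; dbutils exists in Databricks notebooks but not over a plain SSH session):
# Workspace path of the current notebook, e.g. /Users/someone@example.com/two
notebook_path = dbutils.notebook.entry_point.getDbutils().notebook().getContext().notebookPath().get()
print(notebook_path)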
07-18-2019 03:18 PM
The way to solve this is to add the path of your code to sys.path and then import modules from the file, either selectively or all at once.
You can download an example notebook here: https://github.com/javierorozco/databricks_import_python_module
import sys
# Add the path to sys.path -- local disk or a mounted S3 bucket, e.g. /dbfs/mnt/<path_to_bucket>.
# Note that sys.path entries must be directories; appending a .py file itself has no effect.
sys.path.append('/databricks/driver/')
sys.path.append('/databricks/driver/databricks_import_python_module/')
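Once the directory is on sys.path, the module imports like any other (run_test is a hypothetical name standing in for whatever test.py defines):
import test        # resolved via the sys.path entry added above
test.run_test()    # hypothetical function from test.py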
12-24-2019 08:49 PM
@javier.orozco@realeuesit.com, I was able to work with the file created in your repo (test.py), but with my own modules I am getting an error. Is there anything I am missing?
05-13-2020 06:27 PM
Uploading my Python file to DBFS using the Databricks CLI works for me:
dbfs cp mymodule.py dbfs:/path/to/module/mymodule.py --overwrite
Then the following works:
import sys
sys.path.append('/dbfs/path/to/module')
# the file is /dbfs/path/to/module/mymodule.py
import mymodule
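One caveat: Python caches imported modules, so re-running dbfs cp after editing mymodule.py will not be picked up by a notebook that has already imported it. A reload forces the fresh copy to be re-executed:
import importlib
import mymodule
importlib.reload(mymodule)  # re-executes the newly uploaded mymodule.py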
11-28-2020 02:13 AM
It's a nice hack, but how do I connect to the cluster driver to do real remote SSH development? I can connect to the driver via SSH, but there seems to be a different Python there that has no PySpark.
11-28-2020 02:11 AM
This is very cumbersome for someone who is used to developing data science projects with modules, packages, and classes, not just notebooks. Why doesn't Databricks allow this? I know about databricks-connect, but it doesn't solve the problem, as the driver runs locally rather than remotely. What I want is a real remote SSH development experience.
10-11-2021 09:43 AM
USE REPOS! 😁
With Repos, a notebook can call a function defined in a file in the same GitHub repo, as long as Files in Repos is enabled in the admin panel.
So if I have utils.py with:
import pandas as pd

def clean_data():
    # Load wine data
    data = pd.read_csv("/dbfs/databricks-datasets/wine-quality/winequality-white.csv", sep=";")
    print(data)
    # Remove spaces from column names
    data.rename(columns=lambda x: x.replace(' ', '_'), inplace=True)
My notebook can then call the above with:
import utils
utils.clean_data()
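A follow-up note: with Files in Repos enabled, the repo root is put on sys.path automatically, which is why the plain import works. While iterating on utils.py, the standard IPython autoreload extension (which works in Databricks notebooks) saves re-running the import after each edit:
%load_ext autoreload
%autoreload 2   # re-import edited modules automatically before each cell runs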
01-22-2025 08:14 AM
This alternative worked for us: https://community.databricks.com/t5/data-engineering/is-it-possible-to-import-functions-from-a-modul...

