
Is there a way to prevent databricks-connect from installing a global IPython Spark startup script?

jay-cunningham
New Contributor

I'm currently using databricks-connect through VS Code on MacOS. However, this seems to install (and re-install upon deletion) an IPython startup script which initializes a SparkSession. This is fine as far as it goes, except that this script is *global* -- even when my current Python environment doesn't have databricks-connect installed it still tries to run the script before erroring out.

I mean no offense to Databricks, but this is kind of baffling to me? I don't understand why they would have done this in the first place. I don't *actually* want things besides builtins to be defined when I start up an IPython session, and triply so if it's just going to error out.

I don't suppose anyone knows a way to prevent this from happening?

1 REPLY

mark_ott
Databricks Employee

Databricks Connect on macOS (and some other platforms) adds a file to the global IPython startup folder, so every new IPython session, including those in environments unrelated to Databricks, tries to run this SparkSession initialization. This is a frequent source of frustration for users who work in multiple Python environments: it breaks the expected isolation of virtual environments and raises errors when databricks-connect is not installed.

Why This Happens

  • Global IPython Startup Folder: On macOS, the startup script is dropped into ~/.ipython/profile_default/startup/. Any .py or .ipy file there runs whenever IPython or a Jupyter IPython kernel starts with the default profile, not just in environments configured for Databricks Connect.

  • No Environment Awareness: The script does not check for the active environment before running, so you'll get errors if databricks-connect isn't present.

  • Persistence: If you delete the file and then reconfigure or upgrade Databricks Connect, the installer recreates the script.
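To see what is currently being run at startup, the folder can be inspected directly. The following is a minimal sketch, assuming the default location named above (IPython can be pointed elsewhere, for example through the IPYTHONDIR environment variable); `list_startup_scripts` is a hypothetical helper name, not part of IPython or Databricks Connect.

```python
from __future__ import annotations

from pathlib import Path

# Assumed default location; a custom IPYTHONDIR would change this.
DEFAULT_STARTUP_DIR = Path.home() / ".ipython" / "profile_default" / "startup"


def list_startup_scripts(startup_dir: Path = DEFAULT_STARTUP_DIR) -> list[Path]:
    """Return the .py and .ipy files in startup_dir, sorted by file name.

    IPython runs startup files in lexicographical order, so the sorted
    list approximates the order in which they execute.
    """
    if not startup_dir.is_dir():
        return []
    scripts = [
        p for p in startup_dir.iterdir()
        if p.is_file() and p.suffix in {".py", ".ipy"}
    ]
    return sorted(scripts, key=lambda p: p.name)


if __name__ == "__main__":
    # Read-only: prints the scripts found, changes nothing.
    for script in list_startup_scripts():
        print(script)
```

Running this from any environment shows whether a Databricks-related script is present in the default profile, regardless of whether databricks-connect is installed there.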

Workarounds and Solutions

1. Remove Startup Script (Temporary)

Delete the offending file in ~/.ipython/profile_default/startup/, which is usually named something like 00-databricks-connect.py.

bash
rm ~/.ipython/profile_default/startup/00-databricks-connect.py

Note: This solution is temporary. Upgrading or re-running databricks-connect setup will likely restore the file.

2. Disable Global IPython Startup for Non-Databricks Environments

A more sustainable approach is to modify the startup script so that it checks whether databricks-connect is installed in the active environment before doing anything. You can wrap the body of the IPython startup script in a guard like this:

python
try:
    import databricks.connect
    # Do your SparkSession init here
except ImportError:
    pass  # databricks-connect not installed, do nothing

This prevents errors but does not resolve the surprise of global startup actions.
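As a variation on the guard, the check can be done without triggering the full import, which keeps startup cheap in environments that lack the package. This is a sketch under assumptions: `module_available` is a hypothetical helper, and the module name `databricks.connect` is taken from the guard above; the actual contents of the generated startup script are not shown in this thread.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if the module `name` can be located.

    find_spec imports parent packages (here, `databricks`) but not the
    target module itself.
    """
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # A missing parent package raises ModuleNotFoundError, which is
        # a subclass of ImportError.
        return False


if module_available("databricks.connect"):
    # Placeholder: the SparkSession initialization from the original
    # startup script would go here.
    pass
```

Like the try/except version, this only silences the error. The edit is lost if the file is regenerated.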

3. Use a Dedicated IPython Profile

Create an IPython profile just for Databricks Connect:

  • Create the profile from a shell (not from inside IPython):

    bash
    ipython profile create databricks
  • Move (not copy!) the Databricks startup script into this profile's startup directory:

    bash
    mv ~/.ipython/profile_default/startup/00-databricks-connect.py ~/.ipython/profile_databricks/startup/
  • Launch IPython with the new profile:

    bash
    ipython --profile=databricks

    Only this profile will initialize the SparkSession.
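The move step can also be scripted, which helps if the file keeps reappearing in the default profile. The sketch below is an illustration only: `move_startup_scripts` is a hypothetical helper, and the glob pattern `*databricks*.py` is an assumption about the file name, which the steps above give only approximately.

```python
from __future__ import annotations

import shutil
from pathlib import Path


def move_startup_scripts(
    src_dir: Path,
    dst_dir: Path,
    pattern: str = "*databricks*.py",  # assumed naming; adjust to the real file
) -> list[Path]:
    """Move files matching `pattern` from src_dir to dst_dir.

    Returns the new paths. An existing file with the same name in
    dst_dir may be overwritten.
    """
    dst_dir.mkdir(parents=True, exist_ok=True)
    moved = []
    # sorted() materializes the matches before any file is moved.
    for script in sorted(src_dir.glob(pattern)):
        target = dst_dir / script.name
        shutil.move(str(script), str(target))
        moved.append(target)
    return moved
```

With the paths from the steps above, the call would be `move_startup_scripts(Path.home() / ".ipython/profile_default/startup", Path.home() / ".ipython/profile_databricks/startup")`.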

4. File an Issue or Uninstall Globally

Many users have asked Databricks to make the installer less intrusive. Consider upvoting or filing a feature request on their GitHub or feedback forums. Meanwhile, uninstalling databricks-connect globally and installing it only within an isolated environment also prevents the global script from being added.

Summary Table: Approaches

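Approach | Effectiveness | Script recreated on reinstall? | Effort
---------|---------------|--------------------------------|-------
Remove script from ~/.ipython/profile_default/startup/ | Temporary | Yes | Low
Add try/except in script | Stops errors | Yes | Low-Moderate
Use custom IPython profile | Most robust | No | Moderate
Feedback to Databricks | Fixes root cause | N/A | Varies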
