Data Engineering
Using a Virtual environment

hvsk
New Contributor

Hi All,

We are working on training N-HiTS/TFT models (via the PyTorch Forecasting library) for time-series forecasting. However, we are running into package dependency conflicts.

Is there a way to consistently use a virtual environment across cells in a Databricks notebook? What is the recommended approach here?

Here is a screenshot of a minimal example: we create a virtual environment in one cell, but in the next cell we are unable to access it.

Thank You

2 REPLIES

Anonymous
Not applicable

@Harsh Kalra​ :

There are a few ways to manage package dependencies in Databricks notebooks:

  1. Use Databricks' built-in package management: you can install libraries through the UI by going to the "Libraries" tab of your cluster and adding the packages you need. Alternatively, you can install notebook-scoped libraries programmatically; note that dbutils.library.installPyPI is deprecated on newer runtimes in favor of the %pip magic command.
  2. Use a virtual environment: you can create a virtual environment using conda or pipenv and install all the packages you need there. However, this approach can be tricky in a notebook, because shell state does not persist between cells: "activating" the environment in one cell has no effect in the next. Instead, you need to invoke the environment's own interpreter explicitly in every cell that uses it.
  3. Install packages in each cell: you can install the necessary packages with pip in each cell where you need them. However, this can be time-consuming and can clutter your notebook code.

The recommended approach depends on your specific use case and the complexity of your package dependencies. If you're having issues with dependency conflicts, it may be worth trying the built-in, notebook-scoped package management first, since pip's dependency resolution then applies consistently across the whole notebook session.
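For example, a single notebook-scoped install cell at the top of the notebook keeps dependencies consistent for every subsequent cell in that notebook (the version pin here is illustrative only, not a known-good combination):

```
%pip install pytorch-forecasting==1.0.0
```

Note that %pip is a Databricks notebook magic, not a shell command, and it installs libraries scoped to the current notebook session rather than the whole cluster.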

Anonymous
Not applicable

Hi @Harsh Kalra​ 

Hope everything is going great.

Just wanted to check in if you were able to resolve your issue. If yes, would you be happy to mark an answer as best so that other members can find the solution more quickly? If not, please tell us so we can help you. 

Cheers!
