Get Started Discussions
Start your journey with Databricks by joining discussions on getting started guides, tutorials, and introductory topics. Connect with beginners and experts alike to kickstart your Databricks experience.

Why is importing Python code supported in Repos but not in Workspaces?

Mumrel
Contributor

Hi, 

we currently use a one-repo approach which does not require a local development environment (we use Azure DevOps and Nutter for automated tests). We also have code shared across pipelines; we started with %run-style modularization and have now moved to import-style modularization so that the code can also be used within DLT pipelines.

We have now noticed that import-style modularization only works in Repos but not within Workspaces, and we were a bit surprised by that. It feels unexpected and asymmetric. Can you please elaborate on why this is the case and whether there are plans to change it in the future?

We are aware that this is clearly documented here ("Source code can only be imported from files in a Databricks repo. You cannot import source code from a workspace file"), but no reasoning or outlook is given, and that is what my question is about. I also think that almost all use cases that work for workspaces can be translated to repos, but nonetheless I am curious why it is the way it is and how it may develop in the future.

Thanks very much for any insights shared
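For context, the import-style modularization described above works in Repos because the repo root is placed on `sys.path`, so a Python file sitting next to the notebook can be imported like any module. A minimal sketch of that mechanism (the module name `shared_utils` and its helper are hypothetical; outside Databricks, a temp directory stands in for the repo root):

```python
import os
import sys
import tempfile

# Simulate a repo root containing a shared module. In a Databricks repo,
# the file would simply live alongside your notebooks.
repo_root = tempfile.mkdtemp()
with open(os.path.join(repo_root, "shared_utils.py"), "w") as f:
    f.write("def clean_name(s):\n    return s.strip().lower()\n")

# Repos performs this sys.path step for you automatically.
sys.path.insert(0, repo_root)

import shared_utils

print(shared_utils.clean_name("  Sales_Q1  "))  # -> sales_q1
```

The same `import shared_utils` line is what fails when the file lives in the workspace instead of a repo, per the documentation quoted above.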

4 REPLIES

-werners-
Esteemed Contributor III

I cannot tell if or when that functionality will be added to the workspace, but the reason it differs from Repos is probably that with Repos you make an external storage system (git) the master of the files, whereas the Workspace was built from the very beginning as an entity of its own (with little or no git integration), with its own API, etc.
Reworking all of that to provide the same functionality is a lot of work.
There were several requests for decent source-control options using git, so Databricks added that functionality without modifying an existing (working) part of the product (the Workspace).

375721
New Contributor II

You can generate a wheel package and then either:

1) Install that wheel on the cluster (via Libraries) so you can import it wherever you want during development, or

2) Place the wheel in a DBFS location and point the cluster at it:

  • Go to your cluster > Libraries > Install New > and select File Path/S3 as the Library Source
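As a rough sketch of the first step in option 2, a wheel can be built locally from the shared code and then copied to DBFS (the package name, paths, and DBFS target are hypothetical; the copy step assumes a configured databricks CLI and is shown as a comment):

```shell
# Lay out a minimal package around the shared code (hypothetical name).
mkdir -p demo_pkg/shared_utils
cat > demo_pkg/shared_utils/__init__.py <<'EOF'
def clean_name(s):
    return s.strip().lower()
EOF
cat > demo_pkg/setup.py <<'EOF'
from setuptools import setup, find_packages

setup(name="shared-utils", version="0.1.0", packages=find_packages())
EOF

# Build the wheel into ./dist
pip wheel ./demo_pkg -w dist --no-deps --no-build-isolation

# Then copy it to DBFS and install it on the cluster, e.g.:
#   databricks fs cp dist/shared_utils-0.1.0-py3-none-any.whl dbfs:/FileStore/wheels/
```

After the wheel is in DBFS, the cluster Libraries UI step described above makes it importable from any notebook attached to that cluster.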

Mumrel
Contributor

Thanks for your replies, but the core question remains open: why is there a difference? Will the difference disappear in the future? Is it on purpose, due to some restriction, or has it simply grown historically?

-werners-
Esteemed Contributor III

The why is most probably down to different development tracks/teams for the Workspace and Repos.
Whether they will consolidate in functionality? I can't tell, only Databricks knows that; but it seems reasonable to assume that files will also be added to the Workspace.
That being said, IMO using Repos is the way to go for production workloads.
