How to copy notebooks from local to the target folder via asset bundles
Tuesday
Hi all,
I am able to deploy Databricks assets to the target workspace. Jobs and workflows can also be created successfully.
But I have a special requirement: I need to copy the notebooks to a specific target folder in the Databricks workspace.
Example:
on Local I have such a structure:
databricks/
├─ fixtures/
├─ src/
│ ├─ SubFolder1/
│ │ ├─ Notebook1.ipynb
│ ├─ SubFolder2/
│ │ ├─ Notebook2.ipynb
│ ├─ Notebook3.ipynb
├─ resources/
│ ├─ dab.job.yml
├─ .gitignore
On Databricks side, I would like to have this:
Workspace/
├─ Repos/
├─ Shared/
├─ Users/
├─ SubFolder1/
│ ├─ Notebook1.ipynb
├─ SubFolder2/
│ ├─ Notebook2.ipynb
├─ Notebook3.ipynb
How can I configure DAB so that the notebooks are copied to the desired folder in the workspace instead of to the root path along with the other files?
Labels: Workflows
Tuesday
Hi @BobCat62, we meet again 😃,
I hope you are doing great.
You can deploy your notebooks to your workspace, even ones outside your databricks.yml (bundle root path), using the sync paths mapping. By default, though, all these resources go to your specified workspace root path.
If you want to send these files to a specific target path, you can do it in the following way:
```yaml
# databricks.yml
targets:
  test:
    workspace:
      file_path: /Workspace/
    sync:
      paths:
        - .src/subfolder1/*.ipynb
        - .src/subfolder2/*.ipynb
        - ./src/notebook3.ipynb
```
- I have placed the sync and workspace mappings inside a target mapping; you can have them at the root level as well.
- I specified all the notebooks separately, but in your case, since they are all in the src directory, you can just put ./src/ directly as well (see the sketch after this list).
- I have assumed that databricks.yml is in the root directory; if not, change the paths so they are relative to its location.
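For that simpler variant, here is a minimal sketch with the sync mapping at the root level, assuming databricks.yml sits next to the src folder (the bundle name and target name are placeholders):
```yaml
# databricks.yml – hypothetical root-level sync variant
bundle:
  name: my_bundle            # placeholder name, replace with yours

sync:
  paths:
    - ./src                  # syncs everything under src, including subfolders

targets:
  test:
    workspace:
      file_path: /Workspace/ # synced notebooks land under this path
```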
Here are the Databricks docs:
- https://docs.databricks.com/aws/en/dev-tools/bundles/settings#file_path
- https://docs.databricks.com/aws/en/dev-tools/bundles/settings#paths
Wednesday - last edited Wednesday
Hello @ashraf1395 ,
Nice to hear from you, and thank you for your hints.
With your idea, I could achieve half of my aim 😊
You can see the folder structure in my VS Code here:
and here is part of my `databricks.yml` file:
If I deploy with this yml, I get this error:
Error: stat .src/emob1/*.ipynb: no such file or directory
If I remove this part:
I would like to have emob1 and emob2 directly under Workspace, and not under Databricks/src.
Do you have any idea?
Thanks
Wednesday
Hi @BobCat62, I made a small typo: it should be ./src/emob1/*.ipynb, or you can just keep it as ./src/emob1/.
If your databricks.yml file is outside the databricks folder, the path will be
./databricks/src/emob1/
and with file_path: /Workspace/ set, your target files will be added in the correct place. The only drawback is that Databricks syncs the local path dynamically relative to the databricks.yml file.
So if your databricks.yml file is outside the databricks repo, i.e. parallel to it, your Databricks workspace structure will look like this:
/Workspace/databricks/src/emob1/
But if you want emob1 directly, I don't see any way to change the path of the bundle root location.
If it is not mandatory to keep the emob1 and emob2 files inside databricks/src, you can keep them parallel to your databricks.yml, which won't be good practice, I guess.
If you can find a way to change the bundle root location (using an environment variable, etc., though I can't find anything like that anywhere), your work can be done; as a quick workaround, you can keep your emob files parallel to the databricks.yml file (see the sketch below).
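A minimal sketch of that quick workaround, assuming the emob folders are moved next to databricks.yml (folder, target, and path names are illustrative):
```yaml
# local layout (hypothetical):
#   databricks.yml
#   emob1/...
#   emob2/...
#
# databricks.yml
targets:
  test:
    workspace:
      file_path: /Workspace/
    sync:
      paths:
        - ./emob1
        - ./emob2
# expected result: notebooks under /Workspace/emob1/ and /Workspace/emob2/
```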
Thursday
@ashraf1395 Thank you so much...
You mean that if I could put the databricks.yml file inside the databricks folder and relocate my emob folders, then I could achieve my aim?
As I understand it, it is not possible to relocate the databricks.yml because it should be in the root. Is my understanding correct?
Thursday
Right, @BobCat62, databricks.yml should be in the root location. If you change the place of your emob folders relative to databricks.yml, then you can achieve it, for example by taking the emob folders out and keeping them in the root.
I don't know of any other solution to this.
Thursday
What are the permissions on this databricks directory? Can someone delete this directory or any file? In the Shared workspace everyone can delete bundle files or the bundle directory, even if in databricks.yml I granted permissions only to admins ('CAN MANAGE').
```yaml
permissions:
  - user_groups: admins
    level: CAN_MANAGE
```
I'm looking for a safe and common place for all bundles. I don't like the idea of having bundles in user workspaces (Databricks suggests this approach).
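For reference, a hedged sketch of the same permissions block using the key names from the bundle settings reference, which expects group_name rather than user_groups (the group name itself is just an example):
```yaml
# databricks.yml – permissions sketch (group name is an example)
permissions:
  - group_name: admins
    level: CAN_MANAGE
```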
Thursday
Can you store it in a service principal's workspace? I read this somewhere in the documentation but can't find it again.
Thursday
Hi @kmodelew, you can change your workspace root_path to /Workspace or any other common location you want, so that it can be viewed by all and accessed only by those who have the permissions.
By default the bundle path is /Workspace/users/current_user....
You can find more information about this here : https://docs.databricks.com/aws/en/dev-tools/bundles/settings#workspace
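A minimal sketch of such an override, assuming a shared location under /Workspace/Shared (the path and target name are illustrative; ${bundle.name} and ${bundle.target} are standard bundle substitutions):
```yaml
# databricks.yml – hypothetical shared root_path override
targets:
  prod:
    workspace:
      root_path: /Workspace/Shared/.bundle/${bundle.name}/${bundle.target}
```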

