Data Engineering

Asset bundle doesn't sync files to workspace

JonathanFlint
New Contributor II

I've created a completely fresh project with a completely empty workspace

Locally I have the databricks CLI version 0.230.0 installed

I run

databricks bundle init default-python

I have auth set up with a PAT generated by an account that has workspace admin. When I run bundle deploy, it deploys the resources and jobs and creates the pipeline, but it does not sync the src folder and files to the workspace, and it does not give an error.

When viewed in the workspace, the resources that do deploy show an error saying their source code can't be found in the workspace.

If I add a sync block to the databricks.yml file at the top level

sync:
  include:
    - src/**
and run validate I get 

Warning: Pattern src\** does not match any files
at sync.include[0]
in databricks.yml:11:7

Warning: There are no files to sync, please check your .gitignore

I've tried every formatting of src I can think of and always get the same warning. I've tried

src, src/, src/*, .src, .src/, .src/*, src/*.ipynb

Nothing works. The only thing that actually pushes the files to the remote workspace is manually syncing my entire directory, either with a manual databricks bundle sync or with the VS Code extension sync.

After I manually sync the files the errors on the resources in the workspace disappear

The only thing I changed was the catalogue used by the DLT pipeline in the my_project.pipeline.yml file, so that it uses an existing catalogue instead of hive_metastore, because the workspace is enabled for UC.

I've also tried adding the include to the top level include mapping

bundle:
  name: my_project

include:
  - resources/*.yml
 
I've lost an entire day to this now and I can't figure out why the basic template generated by the CLI itself doesn't deploy any notebooks or other files.
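
The validate warning quoted in the question prints the pattern as `src\**`, with a backslash, although the config says `src/**`. One possible reading, not confirmed anywhere in this thread, is that the pattern was rewritten with a Windows path separator before matching. The sketch below uses Python's standard `fnmatch` only as a stand-in for the CLI's matcher (which it is not) to show why a backslash pattern would miss forward-slash relative paths. The file names are hypothetical.

```python
from fnmatch import fnmatchcase

# Hypothetical bundle-relative paths, written with forward slashes.
FILES = [
    "src/notebook.ipynb",
    "src/dlt_pipeline.ipynb",
    "resources/my_project.job.yml",
]


def matching(pattern: str, files: list[str]) -> list[str]:
    """Return the files matched by a glob pattern.

    fnmatchcase is used so the result does not depend on the host OS
    (fnmatch.fnmatch normalizes case and separators on Windows).
    """
    return [f for f in files if fnmatchcase(f, pattern)]


# Pattern as written in databricks.yml.
print(matching("src/**", FILES))

# Pattern as printed in the warning: the backslash is treated as a
# literal character, so no forward-slash path can match it.
print(matching("src\\**", FILES))
```

If the two calls give different results for paths that really exist under src, the separator in the pattern is worth ruling out before looking at .gitignore.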

 

 

7 REPLIES

-werners-
Esteemed Contributor III

Have you looked into the .gitignore file?
Chances are there is an entry with /resources/*.
If so, remove from .gitignore everything that you think should be deployed,
the resources folder for sure.

There's no entry for the resources folder in the .gitignore. It's not the resources I'm having trouble with: they are created successfully in the workspace, but the notebooks they use are not copied to the workspace.

 

I've tried deleting the .gitignore file entirely, and there are no references to the src folder in it to begin with anyway.

 

saurabh18cs
Contributor II

Do you not see this under Users in your Databricks workspace:

[Screenshot: saurabh18cs_0-1729780112365.png]

 

Yes I can see the folder structure but nothing in the src folder apart from a "my_project.egg-info" subfolder with some txt files in it:

[Screenshots: JonathanFlint_0-1729869739834.png, JonathanFlint_1-1729869754404.png]

 

 

filipniziol
Contributor III

Hi @JonathanFlint ,

1. Could you remove the block below? Let's try to make the deployment work without any filters:

sync:
  include:
    - src/**
2. Could you run "databricks bundle destroy" and then try "databricks bundle deploy"? If you were "deploying manually", then I guess you were adding some notebooks by hand. The deployment keeps a history of what was and was not deployed, and it is probably out of sync after those manual changes.

When I first encountered the issue I did not have the 

sync:
  include:
    - src/**
block in my databricks.yml file. I made no changes to the databricks.yml file before I first did the deployment.
 
I've also tried destroying the bundle in the remote workspace and deploying it again, same result

Mathias_Peters
Contributor

Hi, I had a similar problem today. I changed the way we deploy our main bundle using pull requests, and in order to play around with it locally I copied Python and dbt code into the databricks src dir (normally that is done during a GitHub workflow). To avoid accidental commits, I also added the two dirs to the .gitignore at the top level of our repo (i.e. one level above the databricks dir). After that, bundle deploy stopped copying files.

Is that intended? That feels like a bug to me, tbh
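
Since the report above involves a .gitignore that sits one level above the bundle directory, it can help to list every .gitignore between the bundle root and the repository root and see which entries mention the folder that is not being copied. This is a minimal sketch under assumptions: the function names are hypothetical, the search stops at the first directory containing `.git`, and the check is a plain substring search rather than real gitignore pattern matching.

```python
from pathlib import Path


def find_gitignores(bundle_root: Path) -> list[Path]:
    """Collect .gitignore files from bundle_root up to the enclosing Git repo root."""
    found = []
    current = bundle_root.resolve()
    for directory in [current, *current.parents]:
        candidate = directory / ".gitignore"
        if candidate.is_file():
            found.append(candidate)
        if (directory / ".git").exists():
            break  # reached the repository root, stop climbing
    return found


def entries_mentioning(gitignore: Path, name: str) -> list[str]:
    """Return non-empty, non-comment lines of a .gitignore that contain `name`.

    This is a substring check only; it does not evaluate gitignore globs.
    """
    text = gitignore.read_text(encoding="utf-8", errors="replace")
    lines = [line.strip() for line in text.splitlines()]
    return [ln for ln in lines if ln and not ln.startswith("#") and name in ln]


if __name__ == "__main__":
    # Run from the directory that contains databricks.yml.
    for path in find_gitignores(Path.cwd()):
        print(path, entries_mentioning(path, "src"))
```

Any entry it prints from a parent directory is a candidate for the behaviour described above. Where Git is installed, `git check-ignore -v <path>` gives the authoritative answer for a single file.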
