Dbx installation for local development on Vscode

mickniz
Contributor

Hi Folks,

Since Databricks now recommends using dbx instead of databricks-connect, we are trying to set up our local environment following this guide:

dbx by Databricks Labs - Azure Databricks | Microsoft Learn

I have created conf/deployment.yml and dbx/project.json and added them to the root folder of my repo.

The deployment file looks like this:

build:
  python: "pip"
environments:
  default:
    workflows:
      - name: "dbx-xxxx-job"
        spark_python_task:
          python_file: "C:\framework\library\py\platformdata\tests\test_dbutils.py"

I get the error below when running:

dbx execute workflowname --cluster-name=""

Any suggestions?

C:\Python\lib\site-packages\yaml\scanner.py:1149 in scan_flow_scalar

  1146         start_mark = self.get_mark()
  1147         quote = self.peek()
  1148         self.forward()
❱ 1149         chunks.extend(self.scan_flow_scalar_non_spaces(double, start_mark))
  1150         while self.peek() != quote:
  1151             chunks.extend(self.scan_flow_scalar_spaces(double, start_mark))
  1152             chunks.extend(self.scan_flow_scalar_non_spaces(double, start_mark))

C:\Python\lib\site-packages\yaml\scanner.py:1223 in scan_flow_scalar_non_spaces

  1220                     self.scan_line_break()
  1221                     chunks.extend(self.scan_flow_scalar_breaks(double, start_mark))
  1222                 else:
❱ 1223                     raise ScannerError("while scanning a double-quoted scalar", start_ma
  1224                             "found unknown escape character %r" % ch, self.get_mark())
  1225             else:
  1226                 return chunks

3 REPLIES

xiangzhu
Contributor

Hi @Ritu Kumari,

Is this the full error message?

BTW, at least the Python file path is not compliant; see: File references - dbx
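For what it's worth, here is a minimal sketch of how the `spark_python_task` entry might look once the path is compliant. Two things stand out in the posted file. First, inside a double-quoted YAML string the backslash starts an escape sequence, so `\l` in `\library` is rejected as an unknown escape, which matches the ScannerError in the traceback. Second, as I understand the dbx docs, local files should be referenced with the `file://` prefix and a path relative to the project root. The relative path `tests/test_dbutils.py` below is an assumption about the repo layout, not something taken from the original post, so adjust it to where the file actually lives.

```yaml
build:
  python: "pip"
environments:
  default:
    workflows:
      - name: "dbx-xxxx-job"
        spark_python_task:
          # Assumed layout: the test file sits under <project root>/tests/.
          # "file://" marks a dbx file reference to a local file, so no
          # absolute Windows path (and no backslashes) is needed here.
          python_file: "file://tests/test_dbutils.py"
```

If the goal is only to get the YAML to parse, single quotes or forward slashes would also avoid the escape problem, since single-quoted YAML strings treat backslashes literally. That alone would not make the path a valid dbx file reference, though.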

mickniz
Contributor

I fixed this issue, but I am now getting another one while syncing my local repo with the Workspace in the Databricks UI.

When I run the command

dbx sync repo -d <workspace-name> --source .

it runs fine. I can see the files in DBFS, but not under Workspace in the Databricks UI.

Any suggestions?

Hmm... `dbx sync repo -d [repo_name]` works well on my side.

Could you please check:

  1. Use the latest dbx version, currently 0.8.7.
  2. The synced files should appear under the Repos menu, not Workspace.
  3. If your files are not listed in .gitignore, check `dbx sync repo --help`; there are multiple exclusion settings.
  4. By default, `dbx sync repo` keeps watching for file changes. You can modify a file and check whether the output shows something similar to my example, where I changed the file `__init__.py`.

image 
