Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

dbx installation for local development in VS Code

mickniz
Contributor

Hi Folks,

Since Databricks is now asking us to use dbx instead of databricks-connect, we are trying to set up our local environment following the guide:

dbx by Databricks Labs - Azure Databricks | Microsoft Learn

I have created conf/deployment.yml and dbx/project.json and added them to the root folder of my repo.

The deployment file looks like this:

build:
  python: "pip"
environments:
  default:
    workflows:
      - name: "dbx-xxxx-job"
        spark_python_task:
          python_file: "C:\framework\library\py\platformdata\tests\test_dbutils.py"

And I am getting the error below while running

dbx execute workflowname --cluster-name=""

Any suggestions here?

C:\Python\lib\site-packages\yaml\scanner.py:1149 in scan_flow_scalar

  1146         start_mark = self.get_mark()
  1147         quote = self.peek()
  1148         self.forward()
❱ 1149         chunks.extend(self.scan_flow_scalar_non_spaces(double, start_mark))
  1150         while self.peek() != quote:
  1151             chunks.extend(self.scan_flow_scalar_spaces(double, start_mark))
  1152             chunks.extend(self.scan_flow_scalar_non_spaces(double, start_mark))

C:\Python\lib\site-packages\yaml\scanner.py:1223 in scan_flow_scalar_non_spaces

  1220                     self.scan_line_break()
  1221                     chunks.extend(self.scan_flow_scalar_breaks(double, start_mark))
  1222                 else:
❱ 1223                     raise ScannerError("while scanning a double-quoted scalar", start_ma
  1224                             "found unknown escape character %r" % ch, self.get_mark())
  1225             else:
  1226                 return chunks

3 REPLIES

xiangzhu
Contributor III

Hi @Ritu Kumari,

Is this the full error message?

BTW, at least the Python file path is not compliant. See: File references - dbx
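For context on why the path fails: inside a double-quoted YAML string the backslash starts an escape sequence, so a Windows path such as `C:\framework\library\...` runs into sequences like `\l` that YAML does not define. That is consistent with the "found unknown escape character" message in the traceback. The sketch below shows one possible way to derive a project-relative reference with forward slashes instead. It assumes the `file://` prefix described on the dbx file references page and a hypothetical project root of `C:\framework\library\py\platformdata`; the helper name `to_dbx_file_reference` is illustrative and not part of dbx.

```python
from pathlib import PureWindowsPath


def to_dbx_file_reference(abs_path: str, project_root: str) -> str:
    """Turn an absolute Windows path into a project-relative file:// reference.

    Both arguments are Windows-style paths. The result uses forward slashes,
    so it contains no backslashes for YAML to interpret as escapes.
    """
    relative = PureWindowsPath(abs_path).relative_to(PureWindowsPath(project_root))
    return "file://" + relative.as_posix()


# Hypothetical project root: adjust to wherever conf/deployment.yml lives.
python_file = to_dbx_file_reference(
    r"C:\framework\library\py\platformdata\tests\test_dbutils.py",
    r"C:\framework\library\py\platformdata",
)
print(python_file)
```

The resulting string can then be used as the `python_file` value in conf/deployment.yml, provided the project root assumed above matches the actual repo layout.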

mickniz
Contributor

I fixed this issue, but I am getting another issue while syncing my local repo with the Workspace in the Databricks UI.

When I run the command

dbx sync repo -d <workspace name> --source .

the command runs fine. I can see the files in DBFS, but not under Workspace on the Databricks page.

Any suggestions here?

Hmm... `dbx sync repo -d [repo_name]` works well on my side.

Could you please check:

  1. Use the latest dbx version, currently 0.8.7.
  2. The synced files should appear under the Repos menu, not Workspace.
  3. If your files are not declared in the gitignore, check the help with `dbx sync repo --help`; there are multiple exclusion settings.
  4. By default, `dbx sync repo` keeps watching for file changes. Modify a file and check whether the output shows something similar to my example, where the file `__init__.py` was changed.

image 
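As a rough sketch of the first and last checks in the list above, the snippet below reads the locally installed dbx version and assembles the sync command. It only uses the `-d` and `--source` options that already appear in this thread; the repo name `my-repo` and both helper names are hypothetical, and the command is printed rather than executed.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import List, Optional


def installed_dbx_version() -> Optional[str]:
    """Return the installed dbx version, or None if dbx is not installed."""
    try:
        return version("dbx")
    except PackageNotFoundError:
        return None


def build_sync_command(repo_name: str, source: str = ".") -> List[str]:
    """Assemble the sync command as an argument list (not executed here)."""
    return ["dbx", "sync", "repo", "-d", repo_name, "--source", source]


if __name__ == "__main__":
    # "my-repo" is a placeholder for the repo name shown under the Repos menu.
    print("installed dbx version:", installed_dbx_version())
    print("command to run:", " ".join(build_sync_command("my-repo")))
```

Running the printed command in a terminal and then editing a tracked file should produce a new line of sync output if the watcher is working.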
