Databricks Community

mickniz · ‎11-09-2022

Hi Folks,

Since Databricks now recommends using dbx instead of databricks-connect, we are trying to set up our local environment following this guide:

dbx by Databricks Labs - Azure Databricks | Microsoft Learn

I have created conf/deployment.yml and .dbx/project.json and added them to the root folder of my repo.

The deployment file looks like this:

build:
  python: "pip"
environments:
  default:
    workflows:
      - name: "dbx-xxxx-job"
        spark_python_task:
          python_file: "C:\framework\library\py\platformdata\tests\test_dbutils.py"

I get the error below when running:

dbx execute workflowname --cluster-name=""

Any suggestions?

C:\Python\lib\site-packages\yaml\scanner.py:1149 in scan_flow_scalar

  1146         start_mark = self.get_mark()
  1147         quote = self.peek()
  1148         self.forward()
❱ 1149         chunks.extend(self.scan_flow_scalar_non_spaces(double, start_mark))
  1150         while self.peek() != quote:
  1151             chunks.extend(self.scan_flow_scalar_spaces(double, start_mark))
  1152             chunks.extend(self.scan_flow_scalar_non_spaces(double, start_mark))

C:\Python\lib\site-packages\yaml\scanner.py:1223 in scan_flow_scalar_non_spaces

  1220                     self.scan_line_break()
  1221                     chunks.extend(self.scan_flow_scalar_breaks(double, start_mark))
  1222                 else:
❱ 1223                     raise ScannerError("while scanning a double-quoted scalar", start_ma
  1224                             "found unknown escape character %r" % ch, self.get_mark())
  1225             else:
  1226                 return chunks

xiangzhu · ‎11-18-2022

Hi @Ritu Kumari,

Is this the full error message?

By the way, the Python file path at least is not compliant with how dbx expects files to be referenced; see: File references - dbx
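
To make the point about the path concrete: inside a double-quoted YAML string a backslash starts an escape sequence, so in the `python_file` value above `\f` is read as a form feed and `\l` is not a known escape, which matches the "found unknown escape character" error in the traceback. Below is a minimal Python sketch of turning that Windows path into a forward-slash, project-relative reference. The project root used here is only a guess for illustration, the helper name `to_dbx_file_reference` is hypothetical, and the `file://` prefix is my reading of the dbx file reference docs, so please verify it there.

```python
from pathlib import PureWindowsPath


def to_dbx_file_reference(windows_path: str, project_root: str) -> str:
    """Build a project-relative, forward-slash reference from a Windows path.

    Hypothetical helper: the "file://" prefix is an assumption based on the
    dbx file reference docs and should be checked against them.
    """
    # relative_to raises ValueError if the file is not under project_root.
    relative = PureWindowsPath(windows_path).relative_to(PureWindowsPath(project_root))
    # as_posix() swaps backslashes for forward slashes, which YAML does not escape.
    return "file://" + relative.as_posix()


# Raw strings keep the backslashes literal in Python.
# The project root below is assumed; the thread does not state it.
ref = to_dbx_file_reference(
    r"C:\framework\library\py\platformdata\tests\test_dbutils.py",
    r"C:\framework\library\py\platformdata",
)
print(ref)  # file://tests/test_dbutils.py
```

With a value like that, the `python_file` entry contains no backslashes, so the YAML scanner has nothing to misread as an escape.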

mickniz · ‎11-21-2022

I fixed this issue, but now I am hitting another one while syncing my local repo with the Workspace in the Databricks UI.

When I run the command

dbx sync repo -d workspace name --source .

it runs fine. I can see the files in DBFS, but not under Workspace on the Databricks page.

Any suggestions?

xiangzhu · ‎11-21-2022

Hmm... `dbx sync repo -d [repo_name]` works well on my side.

Could you please check the following:

- Use the latest dbx version, currently 0.8.7.
- The synced files should appear under the Repos menu, not Workspace.
- If your files are not listed in .gitignore, check the output of `dbx sync repo --help`; there are multiple exclusion settings.
- By default, `dbx sync repo` keeps watching for file changes. You can modify a file and check whether the output shows something similar to my example, where the file `__init__.py` was changed.