Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
Hi all, I would appreciate some clarity regarding DLT and CDC. So my first question would be: when it comes to the "source" table in the syntax, is that the CDC table or...? Further, if we want to use only Databricks, would mounting a foreign catalog be a g...
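For context on what "source" means in the APPLY CHANGES syntax: it is the table or view that carries the raw CDC change rows, not the target table that DLT maintains for you. A minimal, hedged sketch in DLT Python, with hypothetical table and column names (raw.orders_cdc, order_id, _commit_timestamp, _op):

import dlt
from pyspark.sql.functions import col

# Hypothetical CDC feed; replace with your own change-data source.
@dlt.view
def orders_cdc_feed():
    # This view is the "source" in apply_changes: the raw CDC rows
    # (inserts/updates/deletes), not the final up-to-date table.
    return spark.readStream.table("raw.orders_cdc")

# The target streaming table that DLT keeps current from the CDC feed.
dlt.create_streaming_table("orders")

dlt.apply_changes(
    target="orders",                        # table DLT maintains
    source="orders_cdc_feed",               # CDC changes view defined above
    keys=["order_id"],                      # key used to match change records
    sequence_by=col("_commit_timestamp"),   # ordering column from the CDC feed
    apply_as_deletes=col("_op") == "DELETE",  # optional: treat these rows as deletes
    stored_as_scd_type=1,
)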
Hello, I want to refactor the notebook programmatically. So I have written the code as follows:

import requests
import base64

# Databricks Workspace API URLs
workspace_url = f"{host}/api/2.0/workspace"
export_url = f"{workspace_url}/export"
import_url = f"{worksp...
Hi, I exported a notebook from my workspace to my local machine and want to read it in my Python code. Is there a way to read the content of my notebook programmatically, make the necessary changes, and save it as a dbc/html notebook?
Not sure what you are trying to accomplish here. If you want to export a notebook as Python to do manual editing locally and then import it back into your workspace, why not use Repos and connect to it using VS Code etc.? You can export the notebook as...
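If the goal really is the programmatic export → edit → import round trip described above, here is a minimal sketch against the Workspace API; the host, token, and notebook path are placeholders, and error handling is omitted:

import base64
import requests

host = "https://<your-workspace>.cloud.databricks.com"   # placeholder
token = "<personal-access-token>"                        # placeholder
headers = {"Authorization": f"Bearer {token}"}

# Export the notebook source as base64-encoded text.
resp = requests.get(
    f"{host}/api/2.0/workspace/export",
    headers=headers,
    params={"path": "/Users/me@example.com/my_notebook", "format": "SOURCE"},
)
source = base64.b64decode(resp.json()["content"]).decode("utf-8")

# Edit the source locally as plain text (illustrative change).
source = source.replace("old_table", "new_table")

# Import it back, overwriting the existing notebook.
requests.post(
    f"{host}/api/2.0/workspace/import",
    headers=headers,
    json={
        "path": "/Users/me@example.com/my_notebook",
        "format": "SOURCE",
        "language": "PYTHON",
        "content": base64.b64encode(source.encode("utf-8")).decode("utf-8"),
        "overwrite": True,
    },
)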
Hi @Brad - Unfortunately, it's not possible today to embed Dash in a Databricks notebook cell without our Enterprise-level databricks-dash library. Longer term, we are working towards Dash natively working within Databricks notebooks, but that timeli...
When creating a Foreign Catalog SQL Server connection, a port number is required. However, many SQL Servers have dynamic ports and the port number keeps changing. Is there a solution for this? In most common cases, it should allow an instance name instea...
Hi. In Databricks Workflows, I submit a Spark job (Type = "Spark Submit") and a bunch of parameters, starting with --py-files. This works where all the files are in the same S3 path, but I get errors when I put a "common" module in a different S3 pat...
The below is catered for YARN mode. If your application code primarily consists of Python files and does not require a separate virtual environment with specific dependencies, you can use the --py-files argument in spark-submit:

spark-submit --verbose ...
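If the sticking point is specifically the common module living in a different S3 prefix, one alternative worth trying (a hedged sketch, not the only approach) is to add that file from the driver with SparkContext.addPyFile instead of listing it in --py-files; the bucket and module names below are placeholders:

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Ship an extra module that lives in a different S3 prefix than the main --py-files.
# addPyFile accepts local paths and Hadoop-supported filesystems such as S3.
spark.sparkContext.addPyFile("s3://my-bucket/common/common_utils.py")

# After this, the driver and executors can import it like a normal module.
import common_utils  # hypothetical module name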
Working on an asset bundle/yml file to deploy a job and some notebooks. How do I specify within the yml file a trigger to run the job when files arrive at a specified location? Thanks in advance!
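A hedged sketch of what that could look like in the bundle's job definition, assuming a job named my_job and a placeholder storage location; the trigger block mirrors the Jobs API file_arrival trigger:

resources:
  jobs:
    my_job:
      name: my_job
      trigger:
        pause_status: UNPAUSED
        file_arrival:
          url: "s3://my-bucket/landing/"   # placeholder external location path
      tasks:
        - task_key: process_new_files
          notebook_task:
            notebook_path: ../src/process.py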
I have created a Python wheel file with a simple file structure and uploaded it as a cluster library, and I was able to run the packages in a notebook. However, when I try to create a Job using the Python wheel, provide the package name, and run the task, it fails...
Here you can see a complete template project with (the new!!!) Databricks Asset Bundles tool and a Python wheel task. Please follow the instructions for deployment: https://github.com/andre-salvati/databricks-template
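One common cause of this kind of failure is a mismatch between the job's package name/entry point fields and the wheel's metadata. A hedged setup.py sketch with hypothetical names, showing the entry_points the wheel task needs to be able to resolve:

from setuptools import setup, find_packages

setup(
    name="my_package",            # must match the "Package name" given in the wheel task
    version="0.0.1",
    packages=find_packages(),
    entry_points={
        "console_scripts": [
            # The "Entry point" in the wheel task refers to this name; it must map
            # to an importable function, here my_package/main.py::main().
            "main = my_package.main:main",
        ],
    },
)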
Hi, I am using v0.205.0 of the CLI. I wanted to test the demo project (databricks bundle init) of Databricks Asset Bundles; however, I am getting an error after databricks bundle deploy (validate is ok): artifacts.whl.AutoDetect: Detec...
Here you can see a complete template project with Databricks Asset Bundles and a Python wheel task. Please follow the instructions for deployment: https://github.com/andre-salvati/databricks-template
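If the AutoDetect step is what fails, explicitly declaring the wheel artifact in databricks.yml is one possible workaround; a hedged sketch, assuming a standard setup.py at the bundle root:

artifacts:
  my_wheel:
    type: whl
    path: .
    build: python setup.py bdist_wheel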
I'm using a Python wheel task in a Databricks job with WHEEL dependencies. However, the cluster installed the dependencies as JAR instead of WHEEL. Is this expected behavior or a bug?
Here you can see a complete template project with a Python wheel task and Databricks Asset Bundles. Please follow the instructions for deployment: https://github.com/andre-salvati/databricks-template
I'm using Python (as a Python wheel application) on Databricks. I deploy & run my jobs using dbx. I defined some Databricks Workflows using Python wheel tasks. Everything is working fine, but I'm having an issue extracting "databricks_job_id" & "databricks_ru...
Here you can see a complete template project with Databricks Asset Bundles and a Python wheel task. Please follow the instructions for deployment: https://github.com/andre-salvati/databricks-template. In particular, take a look at the workflow definitio...
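One way to get at those values inside a wheel task is to pass them in as task parameters using the Jobs dynamic value references and read them in the entry point. A hedged sketch with hypothetical parameter names; the {{job.id}} / {{job.run_id}} references are resolved by the Jobs service at run time:

import argparse

def main():
    # The wheel task's "parameters" would include, e.g.:
    #   ["--databricks-job-id", "{{job.id}}", "--databricks-run-id", "{{job.run_id}}"]
    # The curly-brace dynamic value references are substituted before the entry point runs.
    parser = argparse.ArgumentParser()
    parser.add_argument("--databricks-job-id")
    parser.add_argument("--databricks-run-id")
    args = parser.parse_args()

    print(f"job_id={args.databricks_job_id}, run_id={args.databricks_run_id}")

if __name__ == "__main__":
    main()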
I have the Databricks VS Code extension set up to develop and run jobs remotely (with Databricks Connect). I enjoy working on notebooks within the native Databricks workspace, especially for exploratory work, because I can execute blocks of code step b...
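One way to get notebook-like, cell-by-cell execution in VS Code on top of Databricks Connect is the interactive window's # %% cell markers in a plain .py file. A minimal hedged sketch, assuming databricks-connect for Runtime 13+ and an already configured authentication profile:

# %% Create a remote Spark session via Databricks Connect
from databricks.connect import DatabricksSession

spark = DatabricksSession.builder.getOrCreate()

# %% Run a block step by step, like a notebook cell
df = spark.range(10).toDF("n")
df.show()

# %% Another cell: inspect results interactively
df.selectExpr("sum(n) AS total").show()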
When I use the current "result table" option it does not show the table results. This occurs when running SQL commands and the display() function for DataFrames. It is not linked to a Databricks runtime, since it occurs on all runtimes. I am not allow...
In the Spark UI, I can see the application running with its application ID. From the Spark UI, can I see which notebook is running with that application? Is this possible? I am interested in learning more about the jobs and stages, how it works ...
https://spark.apache.org/docs/3.1.1/api/python/reference/api/pyspark.SparkContext.setJobDescription.html
spark.sparkContext.setJobDescription("my name") will make your life easier. Just put it in the notebook. You should also set it again after each action (show, count, ...
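A small hedged example of how that might look in a notebook, so the description shows up next to the corresponding jobs and stages in the Spark UI (table and column names are illustrative):

# Set a description before each action so the Spark UI shows which
# notebook/step produced the job.
spark.sparkContext.setJobDescription("sales_etl notebook - load raw orders")
raw = spark.read.table("samples.tpch.orders")
raw.count()

spark.sparkContext.setJobDescription("sales_etl notebook - aggregate by priority")
raw.groupBy("o_orderpriority").count().show()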