I am trying to execute a local PySpark script on a Databricks cluster via the dbx utility, to test how passing arguments to Python works in Databricks when developing locally. However, the test arguments I am passing are not being read for some reason. Co...
You can pass parameters using dbx launch --parameters. If you want to define them in the deployment template, please try to follow the Databricks API 2.1 schema exactly: https://docs.databricks.com/dev-tools/api/latest/jobs.html#operation/JobsCreate (for examp...
Hi, I have a method named main that takes **kwargs as a parameter:

def main(**kwargs):
    parameterOne = kwargs["param-one"]
    parameterTwo = kwargs["param-two"]
    parameterThree = kwargs["param-optional-one"] if "param-optional-one" in kwargs else...
These are command-line parameters, so they are passed like --param-one=test. You can test this with ArgumentParser:

from argparse import ArgumentParser
parser = ArgumentParser()
parser.add_argument("--param-one", dest="parameterOne")
args = parser.parse_args()
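Putting the two replies together, a minimal runnable sketch of the pattern (the parameter names and the main(**kwargs) shape come from the thread; the dest= mapping and the None-filtering are my own illustration):

```python
from argparse import ArgumentParser


def main(**kwargs):
    # Mirrors the main(**kwargs) entry point from the question;
    # "param-optional-one" falls back to a default when absent.
    parameterOne = kwargs["param-one"]
    parameterTwo = kwargs["param-two"]
    parameterThree = kwargs["param-optional-one"] if "param-optional-one" in kwargs else "default"
    return parameterOne, parameterTwo, parameterThree


def parse_cli(argv):
    parser = ArgumentParser()
    # Note the two leading dashes: --param-one=test.
    # dest uses the hyphenated key so vars(args) matches kwargs["param-one"].
    parser.add_argument("--param-one", dest="param-one", required=True)
    parser.add_argument("--param-two", dest="param-two", required=True)
    parser.add_argument("--param-optional-one", dest="param-optional-one")
    args = parser.parse_args(argv)
    # Drop None values so main()'s own optional handling kicks in.
    return {k: v for k, v in vars(args).items() if v is not None}


print(main(**parse_cli(["--param-one=test", "--param-two=42"])))
# → ('test', '42', 'default')
```

The hyphenated dest keys are only reachable through vars(args), which is exactly what the **kwargs forwarding needs here.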
Anyone know how to solve this error? Course: Data Engineering with Databricks, Notebook: DE 4.2 - Providing Options for External Sources. Attempts to fix: detached and reattached my cluster and started it again. %run ../Includes/Classroom-Setup-4.2 resul...
Hi, I'm using Databricks SQL and I need to power the same widget in a dashboard with a dynamic query. Are there any recommended solutions for this? For more context, I'm building a feature that allows people to see the size of something. That size is...
I believe Redash isn't built that way within Databricks; it's still very limited in its capabilities. I have two solutions for you. I haven't tried either, but see if one works for you: 1. Use Preset with Databricks SQL. 2. A hack - read below: I'm assuming you have one wi...
Hi, I wanted to understand whether my approach to dealing with Delta Lake is correct or not? 1. The first time, I create a Delta table using the following command: df_json.write.mode('overwrite').format('delta').save(delta_silver + json_file_path) 2. I ...
Hey there @Krishna Puthran​ Hope everything is going great! Does @Kaniz Fatma​'s answer help? If it does, would you be happy to mark it as best? If it doesn't, please tell us so we can help you further. We'd love to hear from you. Cheers!
Hi, I am not able to create a SQL endpoint; I am getting the error below. I have selected cluster size 2X-Small on the Azure platform: Clusters are failing to launch. Cluster launch will be retried.
Details for the latest failure: Error: Error code: PublicIPCountLi...
Hey there @Devashish Raverkar​ Hope all is well! Just wanted to check in if you were able to resolve your issue. Would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear f...
In our old workspace the default indentation was 2 spaces. In our new one it has changed to 4 spaces. Of course you can manually change it back to the 2 spaces we used to have, but that does not work. Does anyone know how to solve this issue?
You do have that option under Settings --> User Settings (or Admin Settings? Not sure - I don't have admin access) --> Notebook Settings --> Default indentation for Python cells (in spaces). This will change the indentation for newer cells, but existing one...
When writing unit tests with unittest / pytest in PySpark, reading mock data sources with built-in data types like CSV and JSON (spark.read.format("json")) works just fine. But when reading XMLs with spark.read.format("com.databricks.spark.xml") in the ...
Please install spark-xml from Maven. As it comes from Maven, you need to install it on the cluster you are using, via the cluster settings (alternatively via the API or CLI): https://mvnrepository.com/artifact/com.databricks/spark-xml
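For scripted installs, the same thing can be done against the Libraries API. A minimal sketch of the request payload (the cluster id is a placeholder, and the exact spark-xml version should be picked to match your Scala/runtime version; the payload shape follows POST /api/2.0/libraries/install):

```python
# Hypothetical payload for POST /api/2.0/libraries/install.
# You would send it to https://<workspace-url>/api/2.0/libraries/install
# with your usual REST client and a bearer token.
install_payload = {
    "cluster_id": "1234-567890-abcde123",  # placeholder cluster id
    "libraries": [
        # Maven coordinates for spark-xml; version is an example, verify
        # against https://mvnrepository.com/artifact/com.databricks/spark-xml
        {"maven": {"coordinates": "com.databricks:spark-xml_2.12:0.15.0"}}
    ],
}
```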
I am testing Spark Streaming against a SASL_SSL-enabled Kafka broker in a notebook, as per this guide: https://docs.databricks.com/spark/latest/structured-streaming/kafka.html. I have copied the JKS files to an S3 bucket and mounted it in DBFS. In the notebook, wh...
Thanks. Yes, the '/dbfs/mnt/xxxx/kafka.client.truststore.imported.jks' path worked. Another workaround we got working is to copy the file from S3 to the local filesystem using an init script and use that file path.
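Both fixes work for the same reason: the JVM Kafka client opens the truststore with plain file I/O, so it needs a POSIX path (/dbfs/... via the FUSE mount, or a local path written by an init script) rather than a dbfs:/ URI. A small helper plus the option keys involved, as a sketch (the broker address is a placeholder; the kafka.-prefixed keys are the standard Kafka SSL options passed through by the Spark Kafka source):

```python
def to_fuse_path(path: str) -> str:
    """Rewrite a dbfs:/ URI to the /dbfs FUSE path that plain-file
    readers (such as the JVM truststore loader) can open."""
    if path.startswith("dbfs:/"):
        return "/dbfs/" + path[len("dbfs:/"):].lstrip("/")
    return path


# Options you would pass to spark.readStream.format("kafka").options(**...)
kafka_options = {
    "kafka.bootstrap.servers": "broker-1:9093",  # placeholder broker
    "kafka.security.protocol": "SASL_SSL",
    "kafka.ssl.truststore.location": to_fuse_path(
        "dbfs:/mnt/xxxx/kafka.client.truststore.imported.jks"
    ),
}

print(kafka_options["kafka.ssl.truststore.location"])
# → /dbfs/mnt/xxxx/kafka.client.truststore.imported.jks
```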
I'm having weird behavior with Apache Spark, which I run in a Python notebook on Azure Databricks. I have a dataframe with some data, with 2 columns of interest: name and ftime. I found that I sometimes have duplicated values, sometimes not, depending ...
I followed this guide to create a cluster with a custom container: https://docs.databricks.com/clusters/custom-containers.html. However, when the cluster was created, I couldn't access the web terminal. It resulted in a 502 Bad Gateway.
This is a limitation at the moment: enabling Docker Container Services disables the web terminal. https://docs.databricks.com/clusters/web-terminal.html#limitations
Hello team! As per the documentation, I understand that table statistics (e.g. min, max, count) can be fetched through the Delta log, in order not to read the underlying data of a Delta table. This is the case for numerical types, and timestamp is sup...
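For reference, these per-file statistics live in the stats field of each "add" action in the transaction log (_delta_log/&lt;version&gt;.json). A sketch parsing a hypothetical stats payload (the values are invented for illustration; numRecords / minValues / maxValues / nullCount are the key names Delta writes):

```python
import json

# Hypothetical "stats" string as it appears on an "add" action inside a
# _delta_log JSON entry (values made up for illustration).
raw_stats = json.dumps({
    "numRecords": 3,
    "minValues": {"amount": 1.5, "event_time": "2022-01-01T00:00:00.000Z"},
    "maxValues": {"amount": 9.0, "event_time": "2022-01-03T00:00:00.000Z"},
    "nullCount": {"amount": 0, "event_time": 0},
})

stats = json.loads(raw_stats)
# min/max are collected per column; both the numeric and the timestamp
# column carry them, matching the behavior described in the question.
print(stats["minValues"]["event_time"], stats["maxValues"]["amount"])
```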
I have a notebook functioning as a pipeline, where multiple notebooks are chained together. The issue I'm facing is that some of the notebooks are Spark-optimized and others aren't, and what I want is to use one cluster for the former and another for the ...
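One way to get per-step cluster control is to split the chain into a multi-task job, since Jobs API 2.1 lets each task declare its own cluster. A minimal sketch of such a payload (notebook paths, node types, and versions are placeholders; the field names follow the Jobs 2.1 create-job schema):

```python
# Hypothetical create-job payload (POST /api/2.1/jobs/create): the
# Spark-heavy notebook gets a big cluster, the light one a small cluster.
job_spec = {
    "name": "pipeline-with-two-clusters",
    "tasks": [
        {
            "task_key": "heavy_step",
            "notebook_task": {"notebook_path": "/Repos/project/heavy_notebook"},
            "new_cluster": {
                "spark_version": "10.4.x-scala2.12",   # placeholder runtime
                "node_type_id": "Standard_DS4_v2",     # placeholder node type
                "num_workers": 8,
            },
        },
        {
            "task_key": "light_step",
            "depends_on": [{"task_key": "heavy_step"}],  # runs after heavy_step
            "notebook_task": {"notebook_path": "/Repos/project/light_notebook"},
            "new_cluster": {
                "spark_version": "10.4.x-scala2.12",
                "node_type_id": "Standard_DS3_v2",
                "num_workers": 1,
            },
        },
    ],
}
```

Each task spins up (and tears down) its own job cluster, so the two notebooks no longer have to share one configuration.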
Hi @Niels Ota​ , We haven’t heard from you on the last response from @Prabakar Ammeappin​ , and I was checking back to see if you have a resolution yet. If you have any solution, please share it with the community as it can be helpful to others. Othe...
A similar question has already been added, but the reply is very confusing to me.
Basically, for automated jobs, I want to log the following information from inside a Python notebook that runs in the job:
- What is the cluster configuration (most im...
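For the cluster-configuration part, Databricks exposes cluster metadata in the Spark conf under spark.databricks.clusterUsageTags.* (e.g. clusterId, clusterName). A sketch of a helper, written against a plain dict so the logic runs anywhere; in a notebook you would feed it spark.conf values instead, and the exact tag names are worth double-checking on your runtime:

```python
PREFIX = "spark.databricks.clusterUsageTags."


def cluster_info(conf: dict, keys=("clusterId", "clusterName")) -> dict:
    """Collect cluster metadata tags from a Spark-conf-like mapping.
    On Databricks, pass values read via spark.conf.get(...); here a
    plain dict stands in so the logic is testable off-cluster."""
    return {k: conf.get(PREFIX + k, "unknown") for k in keys}


# Fake conf standing in for the values a job cluster would expose.
fake_conf = {
    PREFIX + "clusterId": "0101-120000-abc123",
    PREFIX + "clusterName": "job-cluster-42",
}
print(cluster_info(fake_conf))
# → {'clusterId': '0101-120000-abc123', 'clusterName': 'job-cluster-42'}
```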
Hi @Thomas Kastl​, just a friendly follow-up. Did any of the responses help you resolve your question? If so, please mark it as best. Otherwise, please let us know if you still need help.
Hello, I'm building a Databricks connector to allow users to issue commands/SQL from a web app. In general, I think the REST API is okay to work with, though it's pretty tedious to write wrapper code for each API call. [Q1] Is there an official (or semi-off...
I don't know if I fully understand dbx; it sounds like a job client to manage jobs and deployments, and I don't see NodeJS support for this project yet. My question was about how to "stream" query results back from Databricks in a NodeJS application; curr...