Data Engineering

Forum Posts

sage5616
by Valued Contributor
  • 4520 Views
  • 2 replies
  • 3 kudos

Resolved! Running local python code with arguments in Databricks via dbx utility.

I am trying to execute a local PySpark script on a Databricks cluster via the dbx utility, to test how passing arguments to Python works in Databricks when developing locally. However, the test arguments I am passing are not being read for some reason. Co...

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 3 kudos

You can pass parameters using dbx launch --parameters. If you want to define them in the deployment template, please follow the Databricks API 2.1 schema exactly: https://docs.databricks.com/dev-tools/api/latest/jobs.html#operation/JobsCreate (for examp...
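A minimal sketch of how this fits together (the script name and flag value below are illustrative placeholders, not taken from the thread): parameters declared for a spark_python_task under the Jobs 2.1 schema arrive in the script as plain command-line arguments, so they can be inspected via sys.argv:

    # my_script.py - executed on the cluster by dbx execute / dbx launch
    import sys

    # A Jobs 2.1 "parameters": ["--param-one=test"] entry arrives here
    # exactly as if the script were run locally with that flag.
    print(sys.argv[1:])  # e.g. ['--param-one=test']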

1 More Replies
ACK
by New Contributor II
  • 1962 Views
  • 2 replies
  • 2 kudos

Resolved! How do I pass kwargs to wheel method?

Hi, I have a method named main that takes **kwargs as a parameter:

    def main(**kwargs):
        parameterOne = kwargs["param-one"]
        parameterTwo = kwargs["param-two"]
        parameterThree = kwargs["param-optional-one"] if "param-optional-one" in kwargs else...

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 2 kudos

These are command-line parameters, so they arrive like --param-one=test. You can test it with ArgumentParser:

    from argparse import ArgumentParser

    parser = ArgumentParser()
    parser.add_argument("--param-one", dest="parameterOne")

    args = parser.parse_args()
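To connect that back to the main(**kwargs) entry point from the question, here is a hedged sketch (the argument names are assumed from the question's excerpt; the wiring is illustrative, not the accepted answer verbatim):

    from argparse import ArgumentParser

    def main(**kwargs):
        parameter_one = kwargs["param-one"]
        parameter_two = kwargs["param-two"]
        print(parameter_one, parameter_two)

    if __name__ == "__main__":
        parser = ArgumentParser()
        # dest may contain hyphens because the values are only ever
        # accessed through the kwargs dict, never as attributes.
        parser.add_argument("--param-one", dest="param-one")
        parser.add_argument("--param-two", dest="param-two")
        # vars() turns the parsed Namespace into a dict that can be
        # splatted straight into main(**kwargs).
        main(**vars(parser.parse_args()))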

1 More Replies
Will_Sullivan
by New Contributor
  • 917 Views
  • 0 replies
  • 0 kudos

How to solve Error in Databricks Academy course DE 4.2 & 4.3, run classroom-setup-4.2 error, "[SQLITE_ERROR] SQL error or missing database (no such table: users)"

Anyone know how to solve this error? Course: Data Engineering with Databricks, Notebook: DE 4.2 - Providing Options for External Sources. Attempts to fix: detached and reattached my cluster and started it again. %run ../Includes/Classroom-Setup-4.2 resul...

bl12
by New Contributor II
  • 1581 Views
  • 2 replies
  • 2 kudos

Resolved! Any ways to power a Databricks SQL dashboard widget with a dynamic query?

Hi, I'm using Databricks SQL and I need to power the same widget in a dashboard with a dynamic query. Are there any recommended solutions for this? For more context, I'm building a feature that allows people to see the size of something. That size is...

Latest Reply
AmanSehgal
Honored Contributor III
  • 2 kudos

I believe Redash isn't built that way within Databricks; it's still very limited in its capabilities. I have two solutions for you. I haven't tried either, but see if one works for you: 1. Use Preset with DB SQL. 2. A hack - read below: I'm assuming you have one wi...

1 More Replies
Krish-685291
by New Contributor III
  • 725 Views
  • 2 replies
  • 0 kudos

Which is the recommended way to write the data back to the delta lake?

Hi, I wanted to understand whether my approach to dealing with Delta Lake is correct or not.
1. The first time, I create a Delta table using the following command:
   df_json.write.mode('overwrite').format('delta').save(delta_silver + json_file_path)
2. I ...
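For reference, a minimal runnable sketch of the two write modes being contrasted here (the path and sample data are placeholders; assumes Delta Lake is available, as it is on Databricks):

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()
    path = "/tmp/delta_silver/events"  # placeholder path

    df = spark.createDataFrame([(1, "a")], ["id", "value"])

    # First load: create the Delta table.
    df.write.mode("overwrite").format("delta").save(path)

    # Later loads: append instead of overwriting, so earlier data is kept.
    df.write.mode("append").format("delta").save(path)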

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hey there @Krishna Puthran. Hope everything is going great! Does @Kaniz Fatma's answer help? If it does, would you be happy to mark it as best? If it doesn't, please tell us so we can help you further. We'd love to hear from you. Cheers!

1 More Replies
devashishraverk
by New Contributor II
  • 1342 Views
  • 2 replies
  • 2 kudos

Not able to create SQL Endpoint in Databricks SQL (Databricks 14-day free trial)

Hi, I am not able to create a SQL Endpoint and am getting the below error. I have selected cluster size 2X-Small on the Azure platform: "Clusters are failing to launch. Cluster launch will be retried. Details for the latest failure: Error: Error code: PublicIPCountLi...

Latest Reply
Anonymous
Not applicable
  • 2 kudos

Hey there @Devashish Raverkar. Hope all is well! Just wanted to check in to see if you were able to resolve your issue; if so, would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help. We'd love to hear f...

1 More Replies
Direo
by Contributor
  • 4076 Views
  • 3 replies
  • 2 kudos

Default indentation for Python has changed after migration to the new workspace

In our old workspace the default indentation was 2 spaces. In our new one it has changed to 4 spaces. Of course you can manually change it back to 2 spaces as we used to have, but that does not work. Does anyone know how to solve this issue?

Latest Reply
ranged_coop
Valued Contributor II
  • 2 kudos

You do have that option under Settings --> User Settings (or Admin Settings? Not sure - I don't have admin access) --> Notebook Settings --> Default indentation for Python cells (in spaces). This will change the indentation for newer cells, but existing one...

2 More Replies
Michael_Galli
by Contributor II
  • 2034 Views
  • 4 replies
  • 2 kudos

Resolved! Unittest in PySpark - how to read XML with Maven com.databricks.spark.xml ?

When writing unit tests with unittest / pytest in PySpark, reading mock datasources with built-in datatypes like CSV and JSON (spark.read.format("json")) works just fine. But when reading XMLs with spark.read.format("com.databricks.spark.xml") in the ...

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 2 kudos

Please install spark-xml from Maven. As it is from Maven, you need to install it on the cluster you are using, in the cluster settings (alternatively via the API or CLI): https://mvnrepository.com/artifact/com.databricks/spark-xml
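For local unit tests specifically, one hedged sketch (the package version, rowTag, and file path are assumptions to adapt): let Spark fetch the package from Maven when the test session is built:

    from pyspark.sql import SparkSession

    # Requires internet access to Maven; pick the artifact matching your
    # Scala version (here 2.12) and a spark-xml release that suits you.
    spark = (SparkSession.builder
             .master("local[1]")
             .config("spark.jars.packages",
                     "com.databricks:spark-xml_2.12:0.14.0")
             .getOrCreate())

    df = (spark.read.format("com.databricks.spark.xml")
          .option("rowTag", "record")        # adjust to your XML layout
          .load("tests/data/sample.xml"))    # placeholder test fixture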

3 More Replies
mj2022
by New Contributor III
  • 1547 Views
  • 2 replies
  • 2 kudos

Spark Streaming with SASL_SSL Kafka throwing java.nio.file.NoSuchFileException: dbfs:/mnt/**/kafka.client.truststore.imported.jks

I am testing Spark Streaming against a SASL_SSL-enabled Kafka broker in a notebook, as per this guide: https://docs.databricks.com/spark/latest/structured-streaming/kafka.html. I have copied the jks files to an S3 bucket and mounted it in DBFS. In the notebook wh...

Latest Reply
mj2022
New Contributor III
  • 2 kudos

Thanks. Yes, the '/dbfs/mnt/xxxx/kafka.client.truststore.imported.jks' path worked. Another workaround we got working is to copy the file from S3 to the local filesystem using an init script and use that file path.
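Put together, a hedged sketch of the reader configuration (broker, topic, and password are placeholders; assumes a Databricks notebook where `spark` exists). Note the POSIX-style /dbfs/... path rather than the dbfs:/... URI, because the Kafka client opens the truststore with plain file I/O:

    df = (spark.readStream
          .format("kafka")
          .option("kafka.bootstrap.servers", "broker:9093")
          .option("subscribe", "my-topic")
          .option("kafka.security.protocol", "SASL_SSL")
          .option("kafka.ssl.truststore.location",
                  "/dbfs/mnt/xxxx/kafka.client.truststore.imported.jks")
          .option("kafka.ssl.truststore.password", "<password>")
          .load())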

1 More Replies
whatthespark
by New Contributor II
  • 2153 Views
  • 4 replies
  • 1 kudos

Inconsistent duplicated row with Spark (Databricks on MS Azure)

I'm seeing weird behavior with Apache Spark, which I run in a Python notebook on Azure Databricks. I have a dataframe with two columns of interest: name and ftime. I found that I sometimes have duplicated values, sometimes not, depending ...

Latest Reply
-werners-
Esteemed Contributor III
  • 1 kudos

I would like to see how you create the df dataframe. In PySpark you can get weird results if you do not clear state, or when you reuse dataframe names.
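One way such inconsistency can arise (an illustrative sketch, not the poster's code): a lazily defined dataframe is recomputed on every action, so any non-determinism upstream can produce different rows each time, while caching pins one materialized result:

    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.master("local[1]").getOrCreate()

    # rand() is non-deterministic: each action re-evaluates the plan,
    # so the same `df` may contain different rows from run to run.
    df = spark.range(100).withColumn("r", F.rand()).filter("r > 0.5")
    print(df.count(), df.count())  # the two counts may differ

    df = df.cache()
    print(df.count(), df.count())  # after caching, actions agree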

3 More Replies
jwilliam
by Contributor
  • 1251 Views
  • 3 replies
  • 2 kudos

Resolved! Cannot use Web Terminal when creating cluster with Custom container.

I followed this guide to create a cluster with a custom container: https://docs.databricks.com/clusters/custom-containers.html. However, when the cluster was created, I couldn't access the web terminal. It resulted in a 502 Bad Gateway.

Latest Reply
Ravi
Valued Contributor
  • 2 kudos

This is a limitation at the moment: enabling Docker Container Services disables the web terminal. https://docs.databricks.com/clusters/web-terminal.html#limitations

2 More Replies
pantelis_mare
by Contributor III
  • 2981 Views
  • 6 replies
  • 2 kudos

Resolved! Delta log statistics - timestamp type not working

Hello team! As per the documentation, I understand that table statistics (e.g. min, max, count) can be fetched through the delta log in order to avoid reading the underlying data of a Delta table. This is the case for numerical types, and timestamp is sup...

[Attachment: max value image.png]
Latest Reply
-werners-
Esteemed Contributor III
  • 2 kudos

Are you sure the timestamp column is a valid Spark timestamp type?
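A quick way to check (a sketch, in a notebook where `spark` exists; the table and column names are placeholders):

    # Verify the column's declared type before expecting min/max stats.
    spark.table("my_table").printSchema()

    # If it shows as string rather than timestamp, cast it explicitly:
    from pyspark.sql import functions as F
    df = spark.table("my_table").withColumn("ts", F.col("ts").cast("timestamp"))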

5 More Replies
niels
by New Contributor III
  • 1789 Views
  • 5 replies
  • 12 kudos

Resolved! Change cluster mid-pipeline

I have a notebook functioning as a pipeline, where multiple notebooks are chained together. The issue I'm facing is that some of the notebooks are Spark-optimized and others aren't, and what I want is to use one cluster for the former and another for the ...
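One common approach (a hedged sketch, since the accepted answer is truncated below; the job name, notebook paths, node types, HOST, and TOKEN are all placeholders): split the chained notebooks into tasks of one multi-task job, giving each task its own cluster via the Jobs 2.1 API:

    import requests

    HOST = "https://<workspace>.azuredatabricks.net"  # placeholder
    TOKEN = "<personal-access-token>"                 # placeholder

    payload = {
        "name": "pipeline",
        "tasks": [
            {
                "task_key": "spark_heavy",
                "notebook_task": {"notebook_path": "/pipeline/heavy"},
                "new_cluster": {"spark_version": "10.4.x-scala2.12",
                                "node_type_id": "Standard_DS4_v2",
                                "num_workers": 8},
            },
            {
                "task_key": "light",
                "depends_on": [{"task_key": "spark_heavy"}],
                "notebook_task": {"notebook_path": "/pipeline/light"},
                "new_cluster": {"spark_version": "10.4.x-scala2.12",
                                "node_type_id": "Standard_DS3_v2",
                                "num_workers": 1},
            },
        ],
    }
    requests.post(f"{HOST}/api/2.1/jobs/create",
                  headers={"Authorization": f"Bearer {TOKEN}"},
                  json=payload).raise_for_status()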

Latest Reply
Kaniz
Community Manager
  • 12 kudos

Hi @Niels Ota, we haven't heard from you since the last response from @Prabakar Ammeappin, and I was checking back to see if you have a resolution yet. If you have any solution, please share it with the community as it can be helpful to others. Othe...

4 More Replies
ThomasKastl
by Contributor
  • 3064 Views
  • 5 replies
  • 4 kudos

Calling Databricks API from Databricks notebook

A similar question has already been added, but the reply is very confusing to me. Basically, for automated jobs, I want to log the following information from inside a Python notebook that runs in the job: - What is the cluster configuration (most im...
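For the cluster-configuration part, a commonly cited (but not officially documented) sketch for calling the REST API from inside the running notebook; it assumes the ambient dbutils and spark objects of a Databricks notebook:

    import requests

    # Workspace URL and a short-lived token from the notebook context.
    ctx = dbutils.notebook.entry_point.getDbutils().notebook().getContext()
    host = ctx.apiUrl().get()
    token = ctx.apiToken().get()

    # The id of the cluster this notebook is attached to.
    cluster_id = spark.conf.get("spark.databricks.clusterUsageTags.clusterId")

    resp = requests.get(f"{host}/api/2.0/clusters/get",
                        headers={"Authorization": f"Bearer {token}"},
                        params={"cluster_id": cluster_id})
    print(resp.json())  # node types, spark_version, autoscale settings, ...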

Latest Reply
jose_gonzalez
Moderator
  • 4 kudos

Hi @Thomas Kastl, just a friendly follow-up. Did any of the responses help you resolve your question? If so, please mark the best one. Otherwise, please let us know if you still need help.

4 More Replies
shawncao
by New Contributor II
  • 924 Views
  • 3 replies
  • 2 kudos

Best practice for using the Databricks API

Hello, I'm building a Databricks connector to allow users to issue commands/SQL from a web app. In general, I think the REST API is okay to work with, though it's pretty tedious to write wrapper code for each API call. [Q1] Is there an official (or semi-off...
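On the tedium point, a minimal hedged sketch of the kind of thin wrapper people usually write (the clusters/list endpoint shown is a real API call; host and token are placeholders):

    import requests

    class DatabricksClient:
        """Tiny REST helper: one place for auth and error handling."""

        def __init__(self, host: str, token: str):
            self.host = host.rstrip("/")
            self.headers = {"Authorization": f"Bearer {token}"}

        def get(self, path: str, **params):
            r = requests.get(f"{self.host}/api/{path}",
                             headers=self.headers, params=params)
            r.raise_for_status()
            return r.json()

    client = DatabricksClient("https://<workspace>", "<token>")
    clusters = client.get("2.0/clusters/list")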

Latest Reply
shawncao
New Contributor II
  • 2 kudos

I don't know if I fully understand dbx; it sounds like a job client to manage jobs and deployments, and I don't see NodeJS support for this project yet. My question was about how to "stream" query results back from Databricks in a NodeJS application, curr...

2 More Replies