Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
Data + AI Summit 2024 - Data Engineering & Streaming

Forum Posts

my_community2
by New Contributor III
  • 5702 Views
  • 10 replies
  • 1 kudos

Running notebooks on Databricks in Azure blowing up all over since morning of Apr 5 (MST). Was there another poor deployment at Databricks? This reall...

Running notebooks on Databricks in Azure have been blowing up all over since the morning of Apr 5 (MST). Was there another poor deployment at Databricks? This really needs to stop. We are running premium Databricks on Azure and calling notebooks from ADF. 10.2 (inc...

Latest Reply
Prabakar
Esteemed Contributor III
  • 1 kudos

@Maciej G try using the init script below to increase the REPL timeout.
--------------------------------------
#!/bin/bash
cat > /databricks/common/conf/set_repl_timeout.conf << EOL
{
  databricks.daemon.driver.launchTimeout = 150
}
EOL
----------------...

  • 1 kudos
9 More Replies
mo91
by New Contributor III
  • 3932 Views
  • 5 replies
  • 9 kudos

Resolved! Community edition - RestException: PERMISSION_DENIED: Model Registry is not enabled for organization 2183541758974102.

Currently running this command:
model_name = "Quality"
model_version = mlflow.register_model(f"runs:/{run_id}/random_forest_model", model_name)
# Registering the model takes a few seconds, so add a small delay
time.sleep(15)
However, I get this error:
RestEx...

Latest Reply
Prabakar
Esteemed Contributor III
  • 9 kudos

@Martin Olowe The Community Edition has certain limitations, and this feature is not available there. To use it, you need the commercial version of Databricks, as mentioned by @Hubert Dudek.

  • 9 kudos
4 More Replies
sarosh
by New Contributor
  • 6862 Views
  • 3 replies
  • 1 kudos

ModuleNotFoundError / SerializationError when executing over databricks-connect

I am running into the following error when I run a model fitting process over databricks-connect. It looks like the worker nodes are unable to access modules from the project's parent directory. Note that the program runs successfully up to this point; n...

Latest Reply
Kaniz_Fatma
Community Manager
  • 1 kudos

Hi @Sarosh Ahmad, just a friendly follow-up. Do you still need help, or did the above responses help you find a solution? Please let us know.

  • 1 kudos
2 More Replies
Bharat105
by New Contributor
  • 791 Views
  • 1 replies
  • 0 kudos

Resolved! Unable to complete signup

I am trying to sign up on Databricks for my organization's use. I am unable to complete the signup as I am not receiving any mail. Please help.

Latest Reply
Kaniz_Fatma
Community Manager
  • 0 kudos

Hi @Bharat Thakur​ , Please share your details on community@databricks.com.

  • 0 kudos
FRG96
by New Contributor III
  • 19334 Views
  • 6 replies
  • 7 kudos

Resolved! How to programmatically get the Spark Job ID of a running Spark Task?

In Spark we can get the Spark Application ID inside the Task programmatically using:
SparkEnv.get.blockManager.conf.getAppId
and we can get the Stage ID and Task Attempt ID of the running Task using:
TaskContext.get.stageId
TaskContext.get.taskAttemptId...

Latest Reply
FRG96
New Contributor III
  • 7 kudos

Hi @Gaurav Rupnar​ , I have Spark SQL UDFs (implemented as Scala methods) in which I want to get the details of the Spark SQL query that called the UDF, especially a unique query ID, which in SparkSQL is the Spark Job ID. That's why I wanted a way to...

  • 7 kudos
5 More Replies
Thom
by New Contributor
  • 359 Views
  • 0 replies
  • 0 kudos

There seem to be missing lesson files in the repo I downloaded for the Data Engineering with Databricks course. The lesson Advanced SQL Transformati...

There seem to be missing lesson files in the repo I downloaded for the Data Engineering with Databricks course. The lesson Advanced SQL Transformations refers to files that aren't in the repo. One or two other lessons were missing as well.

KC_1205
by New Contributor III
  • 2147 Views
  • 4 replies
  • 3 kudos

Resolved! NumPy update 1.18-1.21

Hi all, I am planning to upgrade the Databricks Runtime from 7.3 LTS to 9.1 LTS; the corresponding NumPy version will be 1.19, and later I would like to update to 1.21 in the notebooks. On the cluster I have the Spark version that comes with 9.1 LTS, which will support 1.19, and notebook ...

Latest Reply
Kaniz_Fatma
Community Manager
  • 3 kudos

Hi @Kiran Chalasani, how are you doing? Do you still need help, or have you solved your problem?

  • 3 kudos
3 More Replies
Lincoln_Bergeso
by New Contributor II
  • 6195 Views
  • 10 replies
  • 5 kudos

Resolved! How do I read the contents of a hidden file in a Spark job?

I'm trying to read a file from a Google Cloud Storage bucket. The filename starts with a period, so Spark assumes the file is hidden and won't let me read it. My code is similar to this:
from pyspark.sql import SparkSession
spark = SparkSession.build...

Latest Reply
Kaniz_Fatma
Community Manager
  • 5 kudos

Hi @Lincoln Bergeson, did @Dan Zafar's response help you solve your problem?

  • 5 kudos
9 More Replies
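
For the hidden-file question above, a minimal sketch of why a filename beginning with a period can be skipped: Hadoop-style file listings conventionally treat names starting with `.` or `_` as hidden. The helper name `looks_hidden` and the example paths are hypothetical, and this only illustrates the naming rule in simplified form; it is not Spark's actual implementation, which has additional exceptions.

```python
def looks_hidden(path: str) -> bool:
    """Return True if the last path component starts with '.' or '_'.

    Simplified illustration of the hidden-file naming convention;
    real file listings apply further exceptions not modeled here.
    """
    name = path.rstrip("/").rsplit("/", 1)[-1]
    return name.startswith((".", "_"))


# Hypothetical example paths (bucket and file names are made up).
paths = [
    "gs://example-bucket/data/.config.json",
    "gs://example-bucket/data/_SUCCESS",
    "gs://example-bucket/data/part-00000.json",
]

# Only paths whose final component does not look hidden are kept.
visible = [p for p in paths if not looks_hidden(p)]
print(visible)
```

Only the last path component is checked, so a directory name containing a period does not make a file look hidden under this rule.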
amil
by New Contributor
  • 658 Views
  • 1 replies
  • 0 kudos

Hi Kaniz Fatma, I completed the verification successfully, but the email hasn't arrived. Mail: ss4699@srmist.edu.in Kindly help. Regards, Si...

Hi Kaniz Fatma, I completed the verification successfully, but the email hasn't arrived. Mail: ss4699@srmist.edu.in Kindly help. Regards, Siva

Latest Reply
Kaniz_Fatma
Community Manager
  • 0 kudos

Hi @sivabalan Selvaraj, thank you for reaching out! Let us look into this for you, and we'll follow up with an update.

  • 0 kudos
amil
by New Contributor
  • 569 Views
  • 1 replies
  • 0 kudos

Hi Kaniz, I am unable to access Databricks Community Edition even after solving the puzzle. Mail: amilsivabalan@gmail.com Kindly help. Regards, Siv...

Hi Kaniz, I am unable to access Databricks Community Edition even after solving the puzzle. Mail: amilsivabalan@gmail.com Kindly help. Regards, Sivabalan S

Latest Reply
Kaniz_Fatma
Community Manager
  • 0 kudos

Hi @sivabalan Selvaraj, thank you for reaching out! Let us look into this for you, and we'll follow up with an update.

  • 0 kudos
LukaszJ
by Contributor III
  • 650 Views
  • 0 replies
  • 0 kudos

Real time query plotting

Hello, I have a table on Azure Databricks that I keep updating with the "A" notebook. I want to plot the query result from the table in real time (let's say SELECT COUNT(name), name FROM my_schema.my_table GROUP BY name). I know about Azure Applica...

LukaszJ
by Contributor III
  • 1416 Views
  • 3 replies
  • 1 kudos

Table access control cluster with R language

Hello, I want to have a high concurrency cluster with table access control, and I want to use the R language on it. I know that the documentation says that R and Scala are not available with table access control. But maybe you have some tricks or best practic...

Latest Reply
Kaniz_Fatma
Community Manager
  • 1 kudos

Hi @Łukasz Jaremek, just a friendly follow-up. Do you still need help, or did @Aashita Ramteke's response help you find a solution? Please let us know.

  • 1 kudos
2 More Replies
Vee
by New Contributor
  • 4290 Views
  • 2 replies
  • 1 kudos

Cluster configuration and optimal number for fs.s3a.connection.maximum , fs.s3a.threads.max

Could you please suggest the best cluster configuration for the use case stated below, and tips to resolve the errors shown below? Use case: There could be 4 or 5 Spark jobs that run concurrently. Each job reads 40 input files and spits out 120 output files ...

Latest Reply
Kaniz_Fatma
Community Manager
  • 1 kudos

Hi @Vetrivel Senthil​ , Just a friendly follow-up. Do you still need help? Please let us know.

  • 1 kudos
1 More Replies
samrachmiletter
by New Contributor III
  • 2595 Views
  • 4 replies
  • 5 kudos

Resolved! Is it possible to set order of precedence of spark SQL extensions?

I have the Iceberg SQL extension installed, but running commands such as MERGE INTO results in the error pyspark.sql.utils.AnalysisException: MERGE destination only supports Delta sources. This seems to be due to using Delta's MERGE command as opposed ...

Latest Reply
samrachmiletter
New Contributor III
  • 5 kudos

This does help. I tried going through the DataFrameReader as well but ran into the same error, so it seems it is indeed not possible. Thank you @Hubert Dudek​!

  • 5 kudos
3 More Replies
