Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Pras1
by New Contributor II
  • 10153 Views
  • 2 replies
  • 2 kudos

Resolved! AZURE_QUOTA_EXCEEDED_EXCEPTION - even with more vCPUs than Databricks recommends

I am running this Delta Live Tables PoC from databricks-industry-solutions/industry-solutions-blueprints (https://github.com/databricks-industry-solutions/pos-dlt). I have Standard_DS4_v2 with 28 GB and 8 cores x 2 workers, so a total of 16 cores. This is...
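A rough sketch of the core arithmetic behind a quota check, under the assumption that the driver node uses the same VM size as the workers and also counts against the regional vCPU quota. The function name `required_vcpus` is hypothetical and the figures are only the ones quoted in the post; this is not a statement of how the quota was actually evaluated in this thread.

```python
def required_vcpus(cores_per_node, num_workers, num_drivers=1):
    """Total vCPUs a cluster requests: worker nodes plus driver node(s).

    Assumption: the driver uses the same VM size as the workers.
    """
    return cores_per_node * (num_workers + num_drivers)


# Figures from the post: Standard_DS4_v2 with 8 cores, 2 workers.
workers_only = required_vcpus(8, 2, num_drivers=0)
with_driver = required_vcpus(8, 2)
print(workers_only, with_driver)  # 16 24
```

Comparing the second number, not the first, against the available quota is the more conservative check.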

Latest Reply
Anonymous
Not applicable
  • 2 kudos

Hi @Prasenjit Biswas, we haven't heard from you since the last response from @Jose Gonzalez. Kindly share the information with us, and in return, we will provide you with the necessary solution. Thanks and Regards

  • 2 kudos
1 More Replies
Sas
by New Contributor II
  • 2531 Views
  • 1 replies
  • 0 kudos

A streaming job going into infinite looping

Hi, below I am trying to read data from Kafka, determine whether it's fraud or not, and then write it back to MongoDB. Below is my code, read_kafka.py: from pyspark.sql import SparkSession from pyspark.sql.functions import * from pyspark.sql.types i...

Latest Reply
swethaNandan
Databricks Employee
  • 0 kudos

Hi Saswata,
Can you remove the filter and see if it is printing output to the console?
kafka_df5=kafka_df4.filter(kafka_df4.status=="FRAUD")
Thanks and Regards
Swetha Nandajan

  • 0 kudos
Taha_Hussain
by Databricks Employee
  • 8068 Views
  • 5 replies
  • 8 kudos

Ask your technical questions at Databricks Office Hours! Register here for any of our upcoming dates: May 10 - 11:00 AM - 12:00 PM PT, May 17 - 8:00 AM -...

Ask your technical questions at Databricks Office Hours! Register here for any of our upcoming dates: May 10 - 11:00 AM - 12:00 PM PT, May 17 - 8:00 AM - 9:00 AM PT, May 24 - 9:00 AM - 10:00 AM GMT. Databricks Office Hours connects you directly with experts...

Latest Reply
Priyag1
Honored Contributor II
  • 8 kudos

Thanks for this info

  • 8 kudos
4 More Replies
harraz
by New Contributor III
  • 8427 Views
  • 1 replies
  • 0 kudos

Run result unavailable: run failed with error message Notebook not found:

I'm trying to create a workflow job that fetches the notebook from a remote git repository (Bitbucket Cloud). I tried everything in the Path field and nothing is working. Note that the Bitbucket repo is connected to Databricks already and no issues che...

Latest Reply
Debayan
Databricks Employee
  • 0 kudos

Hi @harraz (Customer), could you please confirm whether Files in Repos has been enabled? https://docs.databricks.com/files/workspace.html#configure-support-for-files-in-repos You can use the command %sh pwd in a notebook inside a repo to check if Files ...

  • 0 kudos
deep_thought
by Contributor
  • 24955 Views
  • 16 replies
  • 9 kudos

Resolved! Schedule job to run sequentially after another job

Is there a way to schedule a job to run after some other job is complete? E.g. schedule Job A, then upon its completion run Job B.

Latest Reply
claytonseverson
Databricks Employee
  • 9 kudos

Here is the User Guide for Jobs-as-Tasks - https://docs.google.com/document/d/1OJsc-g7IwAJjYooCp7T01Rxyt_xFkMPjmAAGdDGPkY4/edit#heading=h.oudvb5fyfd0n
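A minimal sketch of the "Job A, then Job B" pattern asked about in this thread, assuming the Jobs API 2.1 task fields `run_job_task` and `depends_on`. The job name, task keys, and job IDs are hypothetical placeholders. The code only builds the JSON payload for a wrapper job; it does not call the API.

```python
import json


def build_sequential_job(name, first_job_id, second_job_id):
    """Build a job payload that triggers one existing job, then another.

    Assumes the Jobs API 2.1 `run_job_task` and `depends_on` task fields.
    """
    return {
        "name": name,
        "tasks": [
            {
                "task_key": "run_job_a",
                "run_job_task": {"job_id": first_job_id},
            },
            {
                "task_key": "run_job_b",
                # With the default run condition, this task starts only
                # after the Job A task has succeeded.
                "depends_on": [{"task_key": "run_job_a"}],
                "run_job_task": {"job_id": second_job_id},
            },
        ],
    }


# Hypothetical job IDs for illustration only.
payload = build_sequential_job("job_a_then_job_b", 111, 222)
print(json.dumps(payload, indent=2))
```

Keeping Job A and Job B as separate jobs means each can still be scheduled or run on its own, while the wrapper job only expresses the ordering.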

  • 9 kudos
15 More Replies
MarsSu
by New Contributor II
  • 4610 Views
  • 3 replies
  • 3 kudos

Resolved! Does driver node of job compute have HA?

I would like to confirm and discuss the HA mechanism for the driver node of job compute, since we can imagine the driver node as the master node of a cluster. In AWS EMR, we can set up 2 master nodes so that if one master node fails, another master node can re...

Latest Reply
Anonymous
Not applicable
  • 3 kudos

Hi @Mars Su, we haven't heard from you since the last response from @Werner Stinckens and @karthik p, and I was checking back to see if their suggestions helped you. Or else, if you have any solution, please share it with the community, as it can be...

  • 3 kudos
2 More Replies
JordanYaker
by Contributor
  • 3995 Views
  • 1 replies
  • 0 kudos

Batch Doesn't Exist Failure

I have a job that's been working perfectly fine since I deployed it earlier this month. Last night, however, one of the tasks within the job started failing with the following error: java.lang.IllegalStateException: batch 4 doesn't exist at org.apac...

Latest Reply
JordanYaker
Contributor
  • 0 kudos

I tried FSCK REPAIR just on the chance that it would work and it had no effect.

  • 0 kudos
psps
by New Contributor III
  • 5520 Views
  • 3 replies
  • 5 kudos

Databricks Job run logs only show prints/logs from driver and not executors

Hi, in the Databricks Job run output, only logs from the driver are displayed. We have a function parallelized to run on executor nodes. The logs/prints from that function are not displayed in the job run output. Is there a way to configure and show those logs i...

Latest Reply
psps
New Contributor III
  • 5 kudos

Thanks @Debayan Mukherjee​ . This is to enable executor logging. However, the executor logs do not appear in Databricks Job run output. Only driver logs are displayed.

  • 5 kudos
2 More Replies
B_J_Innov
by New Contributor III
  • 8731 Views
  • 12 replies
  • 0 kudos

Resolved! Can't use job cluster for scheduled jobs ADD_NODES_FAILED : Failed to add 9 containers to the cluster. Will attempt retry: false. Reason: Azure Quota Exceeded Exception

Hi everyone, I've been using my all-purpose cluster for scheduled jobs and I've been told that it's a suboptimal thing to do and that using a job cluster for the scheduled jobs cuts costs by half. Unfortunately, when I tried to switch clusters on my ex...

Latest Reply
karthik_p
Esteemed Contributor
  • 0 kudos

@Bassem Jaber If you are seeing the same error, then you need to increase your quota. For that, your Azure plan should be changed from pay-as-you-go to another plan, as the pay-as-you-go Azure model has limitations on quota increases.

  • 0 kudos
11 More Replies
oleole
by Contributor
  • 6224 Views
  • 3 replies
  • 3 kudos

Resolved! How to delay a new job run after job

I have a daily job run that occasionally fails with the error: The spark driver has stopped unexpectedly and is restarting. Your notebook will be automatically reattached. After I get the notification that this job failed on schedule, I manually run ...

Latest Reply
oleole
Contributor
  • 3 kudos

According to this documentation, you can specify the wait time between the "start" of the first run and the retry start time.
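A small sketch of the retry settings this reply refers to, assuming the Jobs API task fields `max_retries`, `min_retry_interval_millis`, and `retry_on_timeout`. The helper name `with_retry_policy`, the task key, and the notebook path are hypothetical. The interval is counted from the start of the failed run, as the reply describes.

```python
def with_retry_policy(task, max_retries, wait_minutes):
    """Return a copy of a task definition with retry settings added."""
    updated = dict(task)
    updated["max_retries"] = max_retries
    # Interval between the start of the failed run and the retry, in ms.
    updated["min_retry_interval_millis"] = wait_minutes * 60 * 1000
    updated["retry_on_timeout"] = False
    return updated


# Hypothetical task definition for illustration only.
task = {
    "task_key": "daily_notebook",
    "notebook_task": {"notebook_path": "/hypothetical/path"},
}
retrying_task = with_retry_policy(task, max_retries=2, wait_minutes=15)
print(retrying_task["min_retry_interval_millis"])  # 900000
```

Because the wait is measured from the start of the failed run, a run that fails after the interval has already elapsed would be retried without additional delay.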

  • 3 kudos
2 More Replies
Michael_Papadop
by New Contributor II
  • 12095 Views
  • 3 replies
  • 0 kudos

How can I set the status of a databricks job as skipped via python?

I have a basic 2 task job. The 1st notebook (task) checks whether the source file has changes and if so then refreshes a corresponding materialized view. In case we have no changes then I use dbutils.jobs.taskValues.set(key = "skip_job", value = 1) &...

Latest Reply
karthik_p
Esteemed Contributor
  • 0 kudos

@Michael Papadopoulos Usually that should not be the case, I think: at the task level we have three notification types (success, failure, start), whereas at the whole-job level a skip option is available to discard notifications. Will see if someone from commu...

  • 0 kudos
2 More Replies
jakubk
by Contributor
  • 12131 Views
  • 13 replies
  • 9 kudos

dbt workflow job limitations - naming the target? where do docs go?

I'm on Unity Catalog. I'm trying to do a dbt run on a project that works locally, but the Databricks dbt workflow task seems to be ignoring the project.yml settings for schemas and catalogs, as well as those defined in the config block of individual model...

Latest Reply
Anonymous
Not applicable
  • 9 kudos

Hi @Jakub K, I'm sorry you could not find a solution to your problem in the answers provided. Our community strives to provide helpful and accurate information, but sometimes an immediate solution may only be available for some issues. I suggest provid...

  • 9 kudos
12 More Replies
Tjadi
by New Contributor III
  • 2104 Views
  • 2 replies
  • 4 kudos

Specifying cluster on running a job

Hi, let's say that I am starting jobs with different parameters at a certain time each day in the following manner: response = requests.post( "https://%s/api/2.0/jobs/run-now" % (DOMAIN), headers={"Authorization": "Bearer %s" % TOKEN}, json={ ...
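Since the snippet in the post is cut off, here is a hedged sketch of what a complete run-now call could look like. It uses the standard library's `urllib.request` in place of `requests` so that it runs without extra packages, and it assumes the `job_id` and `notebook_params` request fields. The domain, token, job ID, and parameter names are placeholders, not values from the post. The request is built but not sent.

```python
import json
import urllib.request


def build_run_now_request(domain, token, job_id, notebook_params):
    """Build (but do not send) a POST request for the run-now endpoint."""
    body = json.dumps({"job_id": job_id, "notebook_params": notebook_params})
    return urllib.request.Request(
        url="https://%s/api/2.0/jobs/run-now" % domain,
        data=body.encode("utf-8"),
        headers={
            "Authorization": "Bearer %s" % token,
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder values for illustration only.
request = build_run_now_request(
    "example.cloud.databricks.com",
    "dapi-placeholder",
    123,
    {"run_date": "2023-06-01"},
)
print(request.full_url)
# Sending is left out on purpose: urllib.request.urlopen(request) would submit it.
```

Building the request separately from sending it makes it easy to inspect the parameters for each daily run before anything is submitted.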

Latest Reply
karthik_p
Esteemed Contributor
  • 4 kudos

@Tjadi Peeters You can select the Autoscaling/Enhanced Scaling option in workflows, which will scale based on workload.

  • 4 kudos
1 More Replies
Saty
by New Contributor
  • 16433 Views
  • 3 replies
  • 1 kudos

Job fails with java.lang.NoClassDefFoundError: Could not initialize class error

Hi, it is Scala code where we connect to Redis to store data (sparkcontext.toRedisKV), and I am also using a Scala UDF. I have executed the same code in a notebook without a Scala object and it works fine, but it fails every time I use the same code in a jar...

Latest Reply
Anonymous
Not applicable
  • 1 kudos

Hi @Satish Kumbhar, hope all is well! Just wanted to check in if you were able to resolve your issue and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from you. Tha...

  • 1 kudos
2 More Replies
yzhang
by New Contributor III
  • 3340 Views
  • 5 replies
  • 0 kudos

Cannot find such info if Databricks supports nested jobs or tasks. For example, I have a 'job_a', which contains a list of tasks, and another ...

Cannot find such info if Databricks supports nested jobs or tasks. For example, I have a 'job_a', which contains a list of tasks, and another 'job_b', which also contains a list of tasks. Now I'd like to have a 'job_all' that will run both 'job_a' and 'job_b...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @Yanan Zhang, thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the response and select the one that best answers yo...

  • 0 kudos
4 More Replies