Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Tahseen0354
by Valued Contributor
  • 4482 Views
  • 4 replies
  • 3 kudos

Resolved! Why am I not receiving any mail sent to the Azure AD Group mailbox when a Databricks job fails?

I have created an Azure AD group of the "Microsoft 365" type with its own email address, which has been added to the notifications of a Databricks job (on failure). But no mail is sent to the Azure group mailbox when the job fails. I am able to send a d...

Latest Reply
SalmanDB2024
New Contributor II
  • 3 kudos

Hey Guide, appreciate your quick response. Actually I am using Databricks on GCP, and the scope of groups is not a concern for me. The thing is that I, as a user of the job, am unable to receive the email. Groups are not configured as of now. Need help regar...

3 More Replies
Mohit_m
by Valued Contributor II
  • 23976 Views
  • 3 replies
  • 4 kudos

Resolved! How to get the Job ID and Run ID and save into a database

We have a Databricks job running a main class from a JAR file. The JAR's code base is in Scala. Now, when our job starts running, we need to log the Job ID and Run ID into a database for future purposes. How can we achieve this?
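One approach (a sketch, not the thread's confirmed answer): pass Databricks' job/run task parameter variables to the JAR task so the Scala main method receives them in args and can persist them to the database. The job settings below are expressed as a Python dict for the Jobs API 2.1; the class name, cluster ID, and flag names are placeholders, and {{job_id}}/{{run_id}} assume the documented task parameter variables.

# Sketch: job settings that forward the built-in {{job_id}} / {{run_id}}
# parameter variables to a JAR task; the Scala main() can then read them
# from args and write them to a database table.
jar_job_settings = {
    "name": "scala-jar-job",
    "tasks": [
        {
            "task_key": "main",
            "spark_jar_task": {
                "main_class_name": "com.example.OurMainObject",  # placeholder class
                "parameters": ["--job-id", "{{job_id}}", "--run-id", "{{run_id}}"],
            },
            "existing_cluster_id": "1234-567890-abcde123",  # placeholder cluster ID
        }
    ],
}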

Latest Reply
Bruno-Castro
New Contributor II
  • 4 kudos

That article is for members only; can we also describe here how to do it (for those who are not Medium members)? Thanks!

2 More Replies
sandeep91
by New Contributor III
  • 7473 Views
  • 5 replies
  • 2 kudos

Resolved! Databricks Job: Package Name and EntryPoint parameters for the Python Wheel file

I have created a Python wheel file with a simple file structure and uploaded it as a cluster library, and I was able to run the packages in a notebook. But when I try to create a job using the Python wheel, provide the package name, and run the task, it fails...
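For context, a minimal sketch of how the package name and entry point of a Python wheel task usually map to the wheel's metadata; my_package, run_job, and the module path are illustrative names, not taken from the original post.

# setup.py (sketch): the job task's "Package name" is the distribution name and
# the "Entry point" must match a console_scripts entry baked into the wheel.
from setuptools import setup, find_packages

setup(
    name="my_package",  # -> package_name in the python_wheel_task
    version="0.1.0",
    packages=find_packages(),
    entry_points={
        "console_scripts": [
            "run_job = my_package.main:main",  # -> entry_point in the python_wheel_task
        ]
    },
)

With such a wheel attached, the task would use package_name="my_package" and entry_point="run_job", and my_package/main.py must define a main() function.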

Latest Reply
AndréSalvati
New Contributor III
  • 2 kudos

There you can see a complete template project with the new Databricks Asset Bundles tool and a Python wheel task. Please follow the instructions for deployment: https://github.com/andre-salvati/databricks-template

4 More Replies
User16790091296
by Contributor II
  • 3297 Views
  • 1 reply
  • 0 kudos

How to create a databricks job with parameters via CLI?

I'm creating a new job in Databricks using the databricks-cli: databricks jobs create --json-file ./deploy/databricks/config/job.config.json with the following JSON: { "name": "Job Name", "new_cluster": { "spark_version": "4.1.x-scala2.1...

Latest Reply
matthew_m
Databricks Employee
  • 0 kudos

This is an old post but still relevant for future readers, so I will answer how it is done. You need to add the base_parameters field to the notebook_task config, like the following: "notebook_task": { "notebook_path": "...", "base_parameters": { ...
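For readers who land here, a sketch of what the full job.config.json could look like with that field added; the notebook path and parameter names are illustrative, and the config is shown as a Python dict only to keep it copy-pasteable.

# Sketch: writing a job.config.json whose notebook_task carries base_parameters,
# then creating the job with the CLI command from the question.
import json

job_config = {
    "name": "Job Name",
    # ...plus the new_cluster block from the original config...
    "notebook_task": {
        "notebook_path": "/Users/someone@example.com/my-notebook",   # placeholder
        "base_parameters": {"env": "dev", "run_date": "2024-01-01"},  # placeholder params
    },
}

with open("./deploy/databricks/config/job.config.json", "w") as f:
    json.dump(job_config, f, indent=2)

# then: databricks jobs create --json-file ./deploy/databricks/config/job.config.json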

LidorAbo
by New Contributor II
  • 6535 Views
  • 1 reply
  • 1 kudos

Bucket ownership of an S3 bucket in Databricks

We had a Databricks job with strange behavior: when we passed the literal string 'output_path' to saveAsTextFile, rather than the output_path variable, the data was saved to the following path: s3://dev-databricks-hy1-rootbucket/nvirginiaprod/3219117805926709/output_pa...
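A small illustration of the suspected mix-up (a sketch with placeholder names, not the original job's code): a quoted name is treated as a relative DBFS path, which lives in the workspace root bucket, while the variable holds the fully qualified S3 URI.

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
rdd = spark.sparkContext.parallelize(["a", "b", "c"])

output_path = "s3a://my-team-bucket/exports/run-42"  # placeholder bucket/prefix

# rdd.saveAsTextFile("output_path")  # literal string -> relative path under the workspace root bucket
rdd.saveAsTextFile(output_path)      # variable -> the intended external S3 location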

Latest Reply
User16752239289
Databricks Employee
  • 1 kudos

I suspect you provided a DBFS path to save the data, hence the data was saved under your workspace root bucket. For the workspace root bucket, the Databricks workspace will use Databricks credentials to make sure Databricks has access to it and is able t...

Divya_Bhadauria
by New Contributor II
  • 6180 Views
  • 2 replies
  • 2 kudos

Running a Databricks job with different parameters automatically

I have a Python script running as a Databricks job. Is there a way I can run this job with a different set of parameters automatically or programmatically, without using the "Run now with different parameters" option available in the UI?
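One option (a sketch; host, token, job ID, and parameter values are placeholders) is to trigger the existing job through the Jobs "run now" REST endpoint and pass python_params on each call:

import requests

HOST = "https://<workspace-url>"      # placeholder
TOKEN = "<personal-access-token>"     # placeholder

resp = requests.post(
    f"{HOST}/api/2.1/jobs/run-now",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json={"job_id": 123, "python_params": ["--env", "prod", "--run-date", "2024-01-01"]},
)
resp.raise_for_status()
print("started run:", resp.json()["run_id"])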

Latest Reply
Anonymous
Not applicable
  • 2 kudos

Hi @Divya Bhadauria, we haven't heard from you since the last response from @Lakshay Goel, and I was checking back to see if their suggestions helped you. Otherwise, if you have any solution, please share it with the community, as it can be helpful to ...

1 More Replies
source2sea
by Contributor
  • 5029 Views
  • 4 replies
  • 2 kudos

Resolved! How to make a Databricks job fail when the application has already returned "exit code 1"?

object OurMainObject extends LazyLogging with IOApp {
  def run(args: List[String]): IO[ExitCode] = {
    logger.info("Started the application")
    val conf = defaultOverrides.withFallback(defaultApplication).withFallback(defaultReference)
    val...

Latest Reply
source2sea
Contributor
  • 2 kudos

My workaround now is to make the code like below, so the Databricks job is marked as failed:

case Left(ex) => {
  // raise the error so the process exits non-zero; chain the logging effect so it actually runs
  IO(logger.error("Glue failure", ex)) *> IO.raiseError(ex)
}

3 More Replies
psps
by New Contributor III
  • 4388 Views
  • 3 replies
  • 4 kudos

Databricks job run logs only show prints/logs from the driver and not the executors

Hi, in the Databricks job run output, only logs from the driver are displayed. We have a function parallelized to run on executor nodes. The logs/prints from that function are not displayed in the job run output. Is there a way to configure and show those logs i...

Latest Reply
psps
New Contributor III
  • 4 kudos

Thanks @Debayan Mukherjee. This enables executor logging; however, the executor logs still do not appear in the Databricks job run output. Only driver logs are displayed.
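One workaround that is sometimes used (a sketch with placeholder logic, not a confirmed fix from this thread): have the parallelized function hand its log lines back to the driver together with its results and print them there, so they land in the job run output.

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

def process_partition(rows):
    logs, results = [], []
    for r in rows:
        results.append(r * 2)          # placeholder "work"
        logs.append(f"processed {r}")
    yield (results, logs)

data = spark.sparkContext.parallelize(range(10), 2)
for results, logs in data.mapPartitions(process_partition).collect():
    for line in logs:
        print(line)  # printed on the driver, so it shows up in the job run output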

2 More Replies
Divya_Bhadauria
by New Contributor II
  • 9354 Views
  • 3 replies
  • 2 kudos

Unable to run python script from git repo in Databricks job

I'm getting "cannot read Python file" when running this job, which is configured to run a Python script from a Git repo. Run result unavailable: run failed with error message Cannot read the python file /Repos/.internal/7c39d645692_commits/ff669d089cd8f93e9...

Latest Reply
Divya_Bhadauria
New Contributor II
  • 2 kudos

Hi Vidula, yes, the above solution worked out for me. I tried debugging using all of the above steps, and it turned out the path I was using in the job config was incorrect.
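For anyone hitting the same error, a sketch of job settings where the script path is given relative to the repository root and the repo itself is declared in git_source (URL, branch, cluster ID, and file path are placeholders):

job_settings = {
    "name": "run-script-from-git",
    "git_source": {
        "git_url": "https://github.com/my-org/my-repo",  # placeholder repo
        "git_provider": "gitHub",
        "git_branch": "main",
    },
    "tasks": [
        {
            "task_key": "main",
            # path relative to the repo root, not a /Repos/... workspace path
            "spark_python_task": {"python_file": "jobs/my_script.py"},
            "existing_cluster_id": "1234-567890-abcde123",  # placeholder
        }
    ],
}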

2 More Replies
MarsSu
by New Contributor II
  • 8961 Views
  • 5 replies
  • 1 kudos

Resolved! Zero-downtime deployment of a Spark Structured Streaming Databricks job with Terraform

I would like to ask how to implement zero-downtime deployment of Spark Structured Streaming on Databricks job compute with Terraform, because we will upgrade the Spark application code version. Currently, we find that every deployment cancels the original...

Latest Reply
Anonymous
Not applicable
  • 1 kudos

@Mars Su: Yes, you can implement zero-downtime deployment of Spark Structured Streaming on Databricks job compute using Terraform. One way to achieve this is by using Databricks' "job clusters" feature, which allows you to create a cluster specifica...
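As a companion to that, a rough sketch of the application-side half of the pattern (the source, sink, paths, and the stop check are all placeholders): the running query polls for a stop signal and shuts itself down, so the newly deployed job version can resume from the same checkpoint.

import time
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

query = (
    spark.readStream.format("rate").load()   # placeholder source
    .writeStream.format("noop")              # placeholder sink
    .option("checkpointLocation", "/tmp/checkpoints/my_stream")
    .start()
)

def stop_requested():
    # placeholder: e.g. check for a marker file or control-table flag set by the deploy pipeline
    return False

while query.isActive:
    if stop_requested():
        query.stop()   # the redeployed job then restarts from the same checkpoint
        break
    time.sleep(30)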

4 More Replies
Michael_Papadop
by New Contributor II
  • 10723 Views
  • 3 replies
  • 0 kudos

How can I set the status of a Databricks job to skipped via Python?

I have a basic two-task job. The first notebook (task) checks whether the source file has changes and, if so, refreshes a corresponding materialized view. In case there are no changes, I use dbutils.jobs.taskValues.set(key = "skip_job", value = 1) &...
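For reference, a sketch of the task-value handoff this describes (task keys are illustrative): the downstream task reads the flag and exits early; a job-level If/else condition task can achieve a similar branch.

# Task 1 (e.g. task_key = "check_source"): record the decision.
dbutils.jobs.taskValues.set(key="skip_job", value=1)

# Task 2 (e.g. task_key = "refresh_view"): read the flag and bail out early.
skip = dbutils.jobs.taskValues.get(
    taskKey="check_source", key="skip_job", default=0, debugValue=0
)
if skip == 1:
    dbutils.notebook.exit("Skipped: no changes in the source file")
# ...otherwise refresh the materialized view here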

Latest Reply
karthik_p
Esteemed Contributor
  • 0 kudos

@Michael Papadopoulos, usually that should not be the case, I think. At the task level we have three notification types (success, failure, start), whereas at the whole-job level a skip option is available to discard notifications. We will see if someone from the commu...

2 More Replies
youssefmrini
by Databricks Employee
  • 1129 Views
  • 1 reply
  • 1 kudos
Latest Reply
youssefmrini
Databricks Employee
  • 1 kudos

You can ensure there is always an active run of your Databricks job with the new continuous trigger type. https://docs.databricks.com/workflows/jobs/jobs.html#continuous-jobs
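A sketch of the relevant fragment of the job settings (Jobs API 2.1) with the continuous trigger enabled; the job name and tasks are placeholders:

job_settings = {
    "name": "always-on-job",
    "continuous": {"pause_status": "UNPAUSED"},  # keeps one run active at all times
    "tasks": [
        # ...the job's tasks, unchanged...
    ],
}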

vinaykumar
by New Contributor III
  • 4345 Views
  • 3 replies
  • 1 kudos

Resolved! Run a Databricks job instantly without waiting for the job cluster to become active

When we run a Databricks job, it takes some time for the job cluster to become active. I also created a pool and attached it to the job cluster, but it still takes time to attach the cluster and for the job cluster to become active and start the job run. Is there any way we can run d...

Latest Reply
-werners-
Esteemed Contributor III
  • 1 kudos

If you want instant processing, you will have to have a cluster running all the time. As mentioned above, Databricks is testing serverless compute for data engineering workloads (comparable to serverless SQL). This fires up a cluster in a few seconds...
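For completeness, a sketch of the always-running-cluster option (IDs and paths are placeholders): the task points at an existing all-purpose cluster instead of a job cluster, trading cluster cost for start-up latency.

job_settings = {
    "name": "low-latency-job",
    "tasks": [
        {
            "task_key": "main",
            "notebook_task": {"notebook_path": "/Users/me@example.com/my_notebook"},
            "existing_cluster_id": "0101-123456-abcdefgh",  # an already-running cluster
        }
    ],
}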

2 More Replies
joakon
by New Contributor III
  • 2739 Views
  • 4 replies
  • 4 kudos

Resolved! Databricks - Workflow - Jobs - Script to automate

Hi, I have created a Databricks job under Workflows, and it's running fine without any issues. I would like to promote this job to other workspaces using a script. Is there a way to script the job definition and deploy it across multiple workspaces? I ...
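One scripted approach (a sketch; hosts, tokens, and the job ID are placeholders) is to read the job definition from the source workspace and recreate its settings in each target workspace via the Jobs REST API:

import requests

SRC = {"host": "https://src-workspace.cloud.databricks.com", "token": "<src-token>"}
DST = {"host": "https://dst-workspace.cloud.databricks.com", "token": "<dst-token>"}

src_job = requests.get(
    f"{SRC['host']}/api/2.1/jobs/get",
    headers={"Authorization": f"Bearer {SRC['token']}"},
    params={"job_id": 123},
).json()

# Only the "settings" block is portable; IDs and timestamps are workspace-specific.
created = requests.post(
    f"{DST['host']}/api/2.1/jobs/create",
    headers={"Authorization": f"Bearer {DST['token']}"},
    json=src_job["settings"],
)
print(created.json())  # {"job_id": ...} in the target workspace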

Latest Reply
joakon
New Contributor III
  • 4 kudos

Thank you @Landan George

3 More Replies
Bartek
by Contributor
  • 2998 Views
  • 0 replies
  • 1 kudos

How to pass all dag_run.conf parameters to python_wheel_task

I want to trigger a Databricks job from Airflow using DatabricksSubmitRunDeferrableOperator, and I need to pass configuration params. Here is an excerpt from my code (the definition is not complete, only crucial properties): from airflow.providers.databricks.op...
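Since the code excerpt is cut off, here is a hedged sketch (package, entry point, connection ID, and cluster spec are placeholders) of one way this is commonly wired up: the DAG renders templates as native objects, so "{{ dag_run.conf }}" arrives as a dict and can be forwarded as the wheel task's named_parameters.

import pendulum
from airflow import DAG
from airflow.providers.databricks.operators.databricks import (
    DatabricksSubmitRunDeferrableOperator,
)

with DAG(
    dag_id="trigger_wheel_job",
    start_date=pendulum.datetime(2024, 1, 1),
    schedule=None,
    render_template_as_native_obj=True,  # so dag_run.conf renders as a dict, not a string
) as dag:
    run_wheel = DatabricksSubmitRunDeferrableOperator(
        task_id="run_wheel",
        databricks_conn_id="databricks_default",
        json={
            "run_name": "wheel-run-from-airflow",
            "tasks": [
                {
                    "task_key": "main",
                    "python_wheel_task": {
                        "package_name": "my_package",  # placeholder
                        "entry_point": "run_job",      # placeholder
                        "named_parameters": "{{ dag_run.conf }}",
                    },
                    "new_cluster": {
                        "spark_version": "13.3.x-scala2.12",
                        "node_type_id": "i3.xlarge",
                        "num_workers": 1,
                    },
                }
            ],
        },
    )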
