Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

brendanc19 · New Contributor III
  • 4054 Views
  • 6 replies
  • 2 kudos

Resolved! Does cancelling a job run roll back any actions performed by the query plan?

If I were to stop a rather large job run, say halfway through execution, will any actions performed on our Delta tables persist, or will they be rolled back? Are there any other risks that I need to be aware of in terms of cancelling a job run halfway t...

Latest Reply
fabian_r · New Contributor II
  • 2 kudos

Hi, as of 2024, is there any way to ensure transaction control across tables in the Delta protocol for failing jobs?
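
Worth noting for future readers: Delta commits are atomic per table, so a cancelled run never leaves a partially applied commit, but commits that already completed are not rolled back, and the protocol has no built-in cross-table transaction. A minimal sketch of undoing committed writes after a cancelled run, assuming a hypothetical table name and version number, run where `spark` is the ambient Databricks session:

```python
# Find the table version written just before the cancelled run,
# then restore to it. "my_table" and version 42 are placeholders.
spark.sql("DESCRIBE HISTORY my_table").show(truncate=False)
spark.sql("RESTORE TABLE my_table TO VERSION AS OF 42")
```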

5 More Replies
hanish · New Contributor II
  • 3330 Views
  • 5 replies
  • 2 kudos

Job cluster support in jobs/runs/submit API

We are using the jobs/runs/submit API of Databricks to create and trigger a one-time run with new_cluster and existing_cluster configurations. We would like to check whether there is a provision to pass "job_clusters" in this API to reuse the same cluster across...

Latest Reply
Nagrjuna · New Contributor II
  • 2 kudos

Hi, any update on the above-mentioned issue? I am unable to submit a one-time job run (api/2.0 or 2.1 jobs/runs/submit) with a shared job cluster, or to have one new cluster be used for all tasks in the job.
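
For reference, Jobs API 2.1 expresses cluster reuse with a top-level job_clusters list and a per-task job_cluster_key; jobs/create definitely accepts it, but whether your workspace's jobs/runs/submit does should be checked against the current API reference. A sketch of the payload shape, with placeholder host, token, and cluster values:

```python
import requests

HOST = "adb-1234567890123456.7.azuredatabricks.net"  # placeholder workspace host
TOKEN = "dapi..."                                     # placeholder access token

# One shared cluster definition, referenced by key from each task.
payload = {
    "run_name": "one-time-run",
    "job_clusters": [{
        "job_cluster_key": "shared",
        "new_cluster": {"spark_version": "13.3.x-scala2.12",
                        "node_type_id": "Standard_DS3_v2",
                        "num_workers": 2},
    }],
    "tasks": [
        {"task_key": "t1", "job_cluster_key": "shared",
         "notebook_task": {"notebook_path": "/Jobs/step1"}},
        {"task_key": "t2", "job_cluster_key": "shared",
         "depends_on": [{"task_key": "t1"}],
         "notebook_task": {"notebook_path": "/Jobs/step2"}},
    ],
}
resp = requests.post(f"https://{HOST}/api/2.1/jobs/runs/submit",
                     headers={"Authorization": f"Bearer {TOKEN}"}, json=payload)
resp.raise_for_status()
print(resp.json()["run_id"])
```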

4 More Replies
TylerTamasaucka · New Contributor II
  • 28599 Views
  • 5 replies
  • 2 kudos

org.apache.spark.sql.AnalysisException: Undefined function: 'MAX'

I am trying to create a JAR for an Azure Databricks job, but some code that works when using the notebook interface does not work when calling the library through a job. The weird part is that the job will complete the first run successfully but on an...

Latest Reply
skaja · New Contributor II
  • 2 kudos

I am facing a similar issue when trying to use the from_utc_timestamp function. I am able to call the function from a Databricks notebook, but when I use the same function inside my Java JAR running as a job in Databricks, it gives the error below. Analys...
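
A common cause of these "Undefined function" errors in JAR jobs is constructing a fresh SparkSession or SQLContext inside the library instead of reusing the session the job already provides, which can leave the built-in function registry incomplete. The original posts are Java/Scala, but here is a PySpark-flavoured sketch of the safe pattern, for illustration only:

```python
from pyspark.sql import SparkSession, functions as F

# Reuse the session the Databricks job already created; building a fresh
# context inside the library can leave built-in SQL functions (MAX,
# from_utc_timestamp, ...) unregistered on job runs.
spark = SparkSession.builder.getOrCreate()

df = spark.range(3).withColumn("utc_ts", F.current_timestamp())
df.select(F.from_utc_timestamp("utc_ts", "America/Los_Angeles").alias("local_ts")).show()
```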

4 More Replies
GGG_P · New Contributor III
  • 5370 Views
  • 3 replies
  • 0 kudos

Databricks Python wheel tasks: how to access the job ID & run ID?

I'm using Python (as a Python wheel application) on Databricks. I deploy & run my jobs using dbx. I defined some Databricks Workflows using Python wheel tasks. Everything is working fine, but I'm having issues extracting "databricks_job_id" & "databricks_ru...

Latest Reply
AndréSalvati · New Contributor III
  • 0 kudos

There you can see a complete template project with Databricks Asset Bundles and a Python wheel task. Please follow the instructions for deployment: https://github.com/andre-salvati/databricks-template. In particular, take a look at the workflow definitio...
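
Besides templates, one lightweight pattern is to pass the built-in variable references as task parameters and read them in the wheel's entry point. A sketch, with illustrative flag names:

```python
import argparse

# In the wheel task definition, pass the variable references, e.g.
#   "parameters": ["--job-id", "{{job_id}}", "--run-id", "{{run_id}}"]
# They resolve to the actual IDs at run time.
def main() -> None:
    parser = argparse.ArgumentParser()
    parser.add_argument("--job-id")
    parser.add_argument("--run-id")
    args = parser.parse_args()
    print(f"job_id={args.job_id} run_id={args.run_id}")

if __name__ == "__main__":
    main()
```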

2 More Replies
lstk · New Contributor
  • 2648 Views
  • 2 replies
  • 1 kudos

Resolved! Job ID value out of range - Azure Logic App Connector

Hello everybody, I tried to build a Logic App custom connector following this explanation: https://medium.com/@poojaanilshinde/create-azure-logic-apps-custom-connector-for-azure-databricks-e51f4524ab27. Now I run into the following problem and wante...

Latest Reply
stefnhuy · New Contributor III
  • 1 kudos

Hey Lukas, I can totally relate to the frustration of encountering those confounding errors when building custom connectors in Azure Logic Apps. The "Job ID value out of range" issue can be quite perplexing, but fear not, for there's a solution on the...
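
The usual culprit here is that Databricks job IDs can exceed the int32 range (2^31 - 1), so a connector parameter declared as a plain integer overflows. A hypothetical fragment of the connector's Swagger definition, shown as a Python dict for illustration:

```python
# Declare the job ID as int64 in the custom connector's Swagger/OpenAPI
# definition; the default int32 is too small for Databricks job IDs.
job_id_param = {
    "name": "job_id",
    "in": "query",
    "type": "integer",
    "format": "int64",  # avoids the "value out of range" error
}
```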

1 More Replies
User16790091296 · Contributor II
  • 3599 Views
  • 1 reply
  • 0 kudos

How to create a Databricks job with parameters via the CLI?

I'm creating a new job in Databricks using the databricks-cli: databricks jobs create --json-file ./deploy/databricks/config/job.config.json, with the following JSON: { "name": "Job Name", "new_cluster": { "spark_version": "4.1.x-scala2.1...

Latest Reply
matthew_m · Databricks Employee
  • 0 kudos

This is an old post but still relevant for future readers, so I will answer how it is done. You need to add the base_parameters field in the notebook_task config, like the following: "notebook_task": { "notebook_path": "...", "base_parameters": { ...
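
A fuller sketch of that config, written as a Python dict and saved as job.config.json for `databricks jobs create --json-file`; the cluster values, notebook path, and parameter names are placeholders:

```python
import json

job_config = {
    "name": "Job Name",
    "new_cluster": {
        "spark_version": "13.3.x-scala2.12",
        "node_type_id": "Standard_DS3_v2",
        "num_workers": 2,
    },
    "notebook_task": {
        "notebook_path": "/Shared/my_notebook",
        # base_parameters are surfaced to the notebook as widgets
        "base_parameters": {"env": "dev", "run_date": "2024-01-01"},
    },
}
with open("job.config.json", "w") as f:
    json.dump(job_config, f, indent=2)
```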

fuselessmatt · Contributor
  • 6269 Views
  • 2 replies
  • 1 kudos

Can I assign a default value for a job parameter from the widget?

The Databricks widget API (dbutils) provides the get function for accessing the job parameters of a job: dbutils.widgets.get('my_param'). Unlike a Python dict, where get returns None or an optional default if the dict doesn't contain the key, the widg...

Latest Reply
Anonymous · Not applicable
  • 1 kudos

Hi @Mattias P, thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers you...
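
For the original question, a common workaround is to wrap the lookup in a try/except so a missing widget falls back to a default; alternatively, defining the widget up front with dbutils.widgets.text(name, default) achieves the same effect. A sketch, assuming the ambient dbutils of a Databricks notebook or job:

```python
# dbutils.widgets.get raises if the widget is not defined on this run,
# so emulate dict.get(key, default) with a fallback.
def get_widget(name, default=None):
    try:
        return dbutils.widgets.get(name)
    except Exception:
        return default

my_param = get_widget("my_param", default="dev")
```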

1 More Replies
TheRealJimShady · New Contributor
  • 10166 Views
  • 7 replies
  • 0 kudos

Resolved! Email destination not appearing in Job's System Notification list.

On job failure I need to send an email with a custom subject line. I have configured the email address as a destination with the subject that I need, but I don't see it as an option that I can choose in the 'System Notification' dialog in the job set...

Latest Reply
Anonymous · Not applicable
  • 0 kudos

Hi @James Smith, hope everything is going great. Just wanted to check in if you were able to resolve your issue. If yes, would you be happy to mark an answer as best so that other members can find the solution more quickly? If not, please tell us so w...

6 More Replies
essentialDatabr · New Contributor II
  • 2923 Views
  • 1 reply
  • 1 kudos

Confusion about {{run_id}} and {{parent_run_id}} variables for Databricks jobs (Azure)

In Databricks jobs on Azure you can use the {{run_id}} and {{parent_run_id}} variables for a specific run: https://docs.databricks.com/workflows/jobs/jobs.html. For Databricks jobs with two or more tasks, {{run_id}} seems to correspond to the task...

Latest Reply
Anonymous · Not applicable
  • 1 kudos

@Kasper H: Yes, you are correct in your understanding that in Databricks jobs with multiple tasks, the {{run_id}} variable corresponds to the task_run_id and the {{parent_run_id}} variable corresponds to the job_run_id. For Databricks jobs with only ...
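
A sketch of wiring both IDs into a notebook task so the distinction is explicit; the parameter names are illustrative:

```python
# In the job definition, map both variables via base_parameters
# (they resolve at run time):
#   "base_parameters": {"task_run_id": "{{run_id}}",
#                       "job_run_id": "{{parent_run_id}}"}
task_run_id = dbutils.widgets.get("task_run_id")  # this task's run
job_run_id = dbutils.widgets.get("job_run_id")    # the whole job run
print(task_run_id, job_run_id)
```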

RajeshRK · Contributor II
  • 4313 Views
  • 7 replies
  • 2 kudos

How to optimize job performance

Hi Team, we have a complex ETL job running in Databricks for 6 hours. The cluster has the below configuration: min workers: 16, max workers: 24, worker and driver node type: Standard_DS14_v2 (16 cores, 128 GB RAM). I have monitored the job progress in Spark...

Latest Reply
Anonymous · Not applicable
  • 2 kudos

Hi @Rajesh Kannan R, thank you for your question! To assist you better, please take a moment to review the answer and let me know if it best fits your needs. Please help us select the best solution by clicking on "Select As Best" if it does. Your feedb...
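
For readers looking for starting points: no config change substitutes for reading the Spark UI (skew, shuffle spill, GC time), but these are common first knobs to check. A sketch with illustrative values:

```python
# Adaptive query execution re-optimizes plans at runtime and can
# mitigate skewed joins; shuffle partitions should be sized to the data.
spark.conf.set("spark.sql.adaptive.enabled", "true")
spark.conf.set("spark.sql.adaptive.skewJoin.enabled", "true")
spark.conf.set("spark.sql.shuffle.partitions", "400")  # illustrative value
```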

6 More Replies
Therdpong · New Contributor III
  • 1885 Views
  • 2 replies
  • 0 kudos

How to check which job clusters have expanded their disks

We would like to know how to check which job clusters have had to expand their disks.

Latest Reply
jose_gonzalez · Databricks Employee
  • 0 kudos

You can check the cluster's event logs: type "disk" in the search box and you will see all the related events there.
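
The same check can be scripted against the Clusters API events endpoint. A sketch with placeholder host, token, and cluster ID; the disk-expansion event type names below are my assumption from the API docs, so verify them against your workspace's reference:

```python
import requests

HOST = "adb-1234567890123456.7.azuredatabricks.net"  # placeholder
TOKEN = "dapi..."                                     # placeholder

resp = requests.post(
    f"https://{HOST}/api/2.0/clusters/events",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json={"cluster_id": "0123-456789-abcdefgh",  # placeholder cluster ID
          "event_types": ["EXPANDED_DISK_SUCCESSFULLY",
                          "FAILED_TO_EXPAND_DISK"]},
)
resp.raise_for_status()
for event in resp.json().get("events", []):
    print(event["timestamp"], event["type"])
```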

1 More Replies
successhawk · New Contributor II
  • 2408 Views
  • 3 replies
  • 2 kudos

Resolved! Is there a way to tell if a created job is not compliant with configured cluster policies before it runs?

As a DevOps engineer, I want to enforce cluster policies at deployment time when the job is deployed/created, well before it is time to actually use it (i.e. before its scheduled/triggered run time without actually running it).

Latest Reply
irfanaziz · Contributor II
  • 2 kudos

Is it not the linked service that defines the kind of cluster created or used for any job? So I believe you could control the configuration via the linked service settings.
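
One way to approximate this in CI is to fetch the policy definition and compare the job's cluster spec against its fixed rules before deploying. A sketch with placeholder host, token, policy ID, and spec; real policies also support ranges and allowlists, so treat this as a starting point only:

```python
import json
import requests

HOST = "adb-1234567890123456.7.azuredatabricks.net"  # placeholder
TOKEN = "dapi..."                                     # placeholder

# The Cluster Policies API returns the policy definition as a JSON string.
policy = requests.get(
    f"https://{HOST}/api/2.0/policies/clusters/get",
    headers={"Authorization": f"Bearer {TOKEN}"},
    params={"policy_id": "ABC123"},  # placeholder policy ID
).json()
definition = json.loads(policy["definition"])

cluster_spec = {"spark_version": "13.3.x-scala2.12",
                "node_type_id": "Standard_DS3_v2"}   # the job's cluster spec
for key, rule in definition.items():
    if rule.get("type") == "fixed" and key in cluster_spec:
        if cluster_spec[key] != rule["value"]:
            print(f"violation: {key} must be {rule['value']}")
```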

2 More Replies
Pragat · New Contributor
  • 1273 Views
  • 1 reply
  • 0 kudos

Databricks job parameterization

I am configuring a Databricks job using multiple notebooks that depend on each other. All the notebooks are parameterized and use similar parameters. How can I configure the parameters at a global level so that all the notebooks can consume...

Latest Reply
Aviral-Bhardwaj · Esteemed Contributor III
  • 0 kudos

Actually, it is very hard, but if you want an alternative option you have to change your code and use the widget feature of Databricks. Maybe this is not the right option, but you can still explore this doc for testing purposes: https://docs.databric...
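
One concrete shape of the widget approach: give every notebook task the same base_parameters in the job definition and read them identically in each notebook (newer workspaces also offer job-level parameters that all tasks inherit). A sketch with illustrative parameter names:

```python
# Job definition side (same base_parameters on every task):
#   "tasks": [
#     {"task_key": "a", "notebook_task": {"notebook_path": "...",
#        "base_parameters": {"env": "dev", "run_date": "2024-01-01"}}},
#     {"task_key": "b", "notebook_task": {"notebook_path": "...",
#        "base_parameters": {"env": "dev", "run_date": "2024-01-01"}}}
#   ]
# Notebook side (identical in each notebook):
env = dbutils.widgets.get("env")
run_date = dbutils.widgets.get("run_date")
```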

alhuelamo · New Contributor II
  • 7878 Views
  • 4 replies
  • 1 kudos

Getting non-traceable NullPointerExceptions

We're running a job that's issuing NullPointerException without traces of our job's code. Does anybody know what would be the best course of action when it comes to debugging these issues? The job is a Scala job running on DBR 11.3 LTS. In case it's rel...

Latest Reply
UmaMahesh1 · Honored Contributor III
  • 1 kudos

A NullPointerException occurs when you access an instance method on a null reference, try to access elements in a null array, or call a method on an object referred to by a null value. To give you a suggestion on how to avoid that, we might ...

3 More Replies
mr_poola49 · New Contributor III
  • 1956 Views
  • 0 replies
  • 5 kudos

Azure Databricks Jobs Connection Timeout (Read Failed)

Azure Databricks jobs failed intermittently due to a connection timeout (Read Failed) while executing an MS SQL stored procedure in an Azure SQL database. My requirement is to process delta records (get delta records using the last refresh date) from Da...
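
Intermittent read timeouts against Azure SQL are often transient, so a common mitigation is to wrap the stored-procedure call in retries with backoff. A sketch using the JVM DriverManager available on Databricks clusters; the JDBC URL, credentials, and procedure name are placeholders:

```python
import time

jdbc_url = ("jdbc:sqlserver://myserver.database.windows.net:1433;"
            "database=mydb;loginTimeout=30")  # placeholder server/database

def exec_proc(retries=3, backoff_s=30):
    driver_manager = spark._sc._gateway.jvm.java.sql.DriverManager
    for attempt in range(1, retries + 1):
        try:
            conn = driver_manager.getConnection(jdbc_url, "user", "password")
            try:
                conn.createStatement().execute("EXEC dbo.process_delta_records")
                return
            finally:
                conn.close()
        except Exception:
            if attempt == retries:
                raise
            time.sleep(backoff_s * attempt)  # back off before retrying

exec_proc()
```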
