Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

brendanc19 · New Contributor III
  • 4054 Views
  • 6 replies
  • 2 kudos

Resolved! Does cancelling a job run roll back any actions performed by the query plan?

If I were to stop a rather large job run, say halfway through execution, will any actions performed on our Delta tables persist, or will they be rolled back? Are there any other risks that I need to be aware of in terms of cancelling a job run halfway t...

Latest Reply
fabian_r · New Contributor II
  • 2 kudos

Hi, as of 2024, is there any way to ensure transaction control across tables in the Delta protocol for failing jobs?
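
Worth noting for future readers: Delta commits are atomic per table, so a cancelled run never leaves a partially applied commit, but commits that already completed are not rolled back, and the protocol has no built-in cross-table transaction. A minimal sketch of undoing committed writes after a cancelled run, assuming a hypothetical table name and version number, run where `spark` is the ambient Databricks session:

```python
# Find the table version written just before the cancelled run,
# then restore to it. "my_table" and version 42 are placeholders.
spark.sql("DESCRIBE HISTORY my_table").show(truncate=False)
spark.sql("RESTORE TABLE my_table TO VERSION AS OF 42")
```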

5 More Replies
hanish · New Contributor II
  • 3330 Views
  • 5 replies
  • 2 kudos

Job cluster support in jobs/runs/submit API

We are using the jobs/runs/submit API of Databricks to create and trigger a one-time run with new_cluster and existing_cluster configurations. We would like to check whether there is a provision to pass "job_clusters" in this API to reuse the same cluster across...

Latest Reply
Nagrjuna · New Contributor II
  • 2 kudos

Hi, any update on the above-mentioned issue? I am unable to submit a one-time job run (api/2.0 or 2.1 jobs/runs/submit) with a shared job cluster, or to have one new cluster be used for all tasks in the job.
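
For reference, Jobs API 2.1 expresses cluster reuse with a top-level job_clusters list and a per-task job_cluster_key; jobs/create definitely accepts it, but whether your workspace's jobs/runs/submit does should be checked against the current API reference. A sketch of the payload shape, with placeholder host, token, and cluster values:

```python
import requests

HOST = "adb-1234567890123456.7.azuredatabricks.net"  # placeholder workspace host
TOKEN = "dapi..."                                     # placeholder access token

# One shared cluster definition, referenced by key from each task.
payload = {
    "run_name": "one-time-run",
    "job_clusters": [{
        "job_cluster_key": "shared",
        "new_cluster": {"spark_version": "13.3.x-scala2.12",
                        "node_type_id": "Standard_DS3_v2",
                        "num_workers": 2},
    }],
    "tasks": [
        {"task_key": "t1", "job_cluster_key": "shared",
         "notebook_task": {"notebook_path": "/Jobs/step1"}},
        {"task_key": "t2", "job_cluster_key": "shared",
         "depends_on": [{"task_key": "t1"}],
         "notebook_task": {"notebook_path": "/Jobs/step2"}},
    ],
}
resp = requests.post(f"https://{HOST}/api/2.1/jobs/runs/submit",
                     headers={"Authorization": f"Bearer {TOKEN}"}, json=payload)
resp.raise_for_status()
print(resp.json()["run_id"])
```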

4 More Replies
TylerTamasaucka · New Contributor II
  • 28599 Views
  • 5 replies
  • 2 kudos

org.apache.spark.sql.AnalysisException: Undefined function: 'MAX'

I am trying to create a JAR for an Azure Databricks job, but some code that works when using the notebook interface does not work when calling the library through a job. The weird part is that the job will complete the first run successfully but on an...

Latest Reply
skaja · New Contributor II
  • 2 kudos

I am facing a similar issue when trying to use the from_utc_timestamp function. I am able to call the function from a Databricks notebook, but when I use the same function inside my Java JAR running as a job in Databricks, it gives the error below. Analys...
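
A common cause of these "Undefined function" errors in JAR jobs is constructing a fresh SparkSession or SQLContext inside the library instead of reusing the session the job already provides, which can leave the built-in function registry incomplete. The original posts are Java/Scala, but here is a PySpark-flavoured sketch of the safe pattern, for illustration only:

```python
from pyspark.sql import SparkSession, functions as F

# Reuse the session the Databricks job already created; building a fresh
# context inside the library can leave built-in SQL functions (MAX,
# from_utc_timestamp, ...) unregistered on job runs.
spark = SparkSession.builder.getOrCreate()

df = spark.range(3).withColumn("utc_ts", F.current_timestamp())
df.select(F.from_utc_timestamp("utc_ts", "America/Los_Angeles").alias("local_ts")).show()
```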

4 More Replies
GGG_P · New Contributor III
  • 5370 Views
  • 3 replies
  • 0 kudos

Databricks Python wheel tasks: how to access the job ID & run ID?

I'm using Python (as a Python wheel application) on Databricks. I deploy & run my jobs using dbx. I defined some Databricks Workflows using Python wheel tasks. Everything is working fine, but I'm having issues extracting "databricks_job_id" & "databricks_ru...

Latest Reply
AndréSalvati · New Contributor III
  • 0 kudos

There you can see a complete template project with Databricks Asset Bundles and a Python wheel task. Please follow the instructions for deployment: https://github.com/andre-salvati/databricks-template. In particular, take a look at the workflow definitio...
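
Besides templates, one lightweight pattern is to pass the built-in variable references as task parameters and read them in the wheel's entry point. A sketch, with illustrative flag names:

```python
import argparse

# In the wheel task definition, pass the variable references, e.g.
#   "parameters": ["--job-id", "{{job_id}}", "--run-id", "{{run_id}}"]
# They resolve to the actual IDs at run time.
def main() -> None:
    parser = argparse.ArgumentParser()
    parser.add_argument("--job-id")
    parser.add_argument("--run-id")
    args = parser.parse_args()
    print(f"job_id={args.job_id} run_id={args.run_id}")

if __name__ == "__main__":
    main()
```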

2 More Replies
lstk · New Contributor
  • 2648 Views
  • 2 replies
  • 1 kudos

Resolved! Job ID value out of range - Azure Logic App Connector

Hello everybody, I tried to build a Logic App custom connector following this explanation: https://medium.com/@poojaanilshinde/create-azure-logic-apps-custom-connector-for-azure-databricks-e51f4524ab27. Now I run into the following problem and wante...

Latest Reply
stefnhuy · New Contributor III
  • 1 kudos

Hey Lukas, I can totally relate to the frustration of encountering those confounding errors when building custom connectors in Azure Logic Apps. The "Job ID value out of range" issue can be quite perplexing, but fear not, for there's a solution on the...
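
The usual culprit here is that Databricks job IDs can exceed the int32 range (2^31 - 1), so a connector parameter declared as a plain integer overflows. A hypothetical fragment of the connector's Swagger definition, shown as a Python dict for illustration:

```python
# Declare the job ID as int64 in the custom connector's Swagger/OpenAPI
# definition; the default int32 is too small for Databricks job IDs.
job_id_param = {
    "name": "job_id",
    "in": "query",
    "type": "integer",
    "format": "int64",  # avoids the "value out of range" error
}
```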

1 More Replies
User16790091296 · Contributor II
  • 3599 Views
  • 1 reply
  • 0 kudos

How to create a Databricks job with parameters via the CLI?

I'm creating a new job in Databricks using the databricks-cli: databricks jobs create --json-file ./deploy/databricks/config/job.config.json, with the following JSON: { "name": "Job Name", "new_cluster": { "spark_version": "4.1.x-scala2.1...

Latest Reply
matthew_m · Databricks Employee
  • 0 kudos

This is an old post but still relevant for future readers, so I will answer how it is done. You need to add the base_parameters field in the notebook_task config, like the following: "notebook_task": { "notebook_path": "...", "base_parameters": { ...
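
A fuller sketch of that config, written as a Python dict and saved as job.config.json for `databricks jobs create --json-file`; the cluster values, notebook path, and parameter names are placeholders:

```python
import json

job_config = {
    "name": "Job Name",
    "new_cluster": {
        "spark_version": "13.3.x-scala2.12",
        "node_type_id": "Standard_DS3_v2",
        "num_workers": 2,
    },
    "notebook_task": {
        "notebook_path": "/Shared/my_notebook",
        # base_parameters are surfaced to the notebook as widgets
        "base_parameters": {"env": "dev", "run_date": "2024-01-01"},
    },
}
with open("job.config.json", "w") as f:
    json.dump(job_config, f, indent=2)
```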

fuselessmatt · Contributor
  • 6269 Views
  • 2 replies
  • 1 kudos

Can I assign a default value for a job parameter from the widget?

The Databricks widget API (dbutils) provides the get function for accessing the job parameters of a job: dbutils.widgets.get('my_param'). Unlike a Python dict, where get returns None or an optional default if the dict doesn't contain the key, the widg...

Latest Reply
Anonymous · Not applicable
  • 1 kudos

Hi @Mattias P, thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers you...
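
For the original question, a common workaround is to wrap the lookup in a try/except so a missing widget falls back to a default; alternatively, defining the widget up front with dbutils.widgets.text(name, default) achieves the same effect. A sketch, assuming the ambient dbutils of a Databricks notebook or job:

```python
# dbutils.widgets.get raises if the widget is not defined on this run,
# so emulate dict.get(key, default) with a fallback.
def get_widget(name, default=None):
    try:
        return dbutils.widgets.get(name)
    except Exception:
        return default

my_param = get_widget("my_param", default="dev")
```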

1 More Replies
TheRealJimShady · New Contributor
  • 10166 Views
  • 7 replies
  • 0 kudos

Resolved! Email destination not appearing in Job's System Notification list.

On job failure I need to send an email with a custom subject line. I have configured the email address as a destination with the subject that I need, but I don't see it as an option that I can choose in the 'System Notification' dialog in the job set...

Latest Reply
Anonymous · Not applicable
  • 0 kudos

Hi @James Smith, hope everything is going great. Just wanted to check in if you were able to resolve your issue. If yes, would you be happy to mark an answer as best so that other members can find the solution more quickly? If not, please tell us so w...

6 More Replies
essentialDatabr · New Contributor II
  • 2923 Views
  • 1 reply
  • 1 kudos

Confusion about {{run_id}} and {{parent_run_id}} variables for Databricks jobs (Azure)

In Databricks jobs on Azure you can use the {{run_id}} and {{parent_run_id}} variables for a specific run: https://docs.databricks.com/workflows/jobs/jobs.html. For Databricks jobs with two or more tasks, {{run_id}} seems to correspond to the task...

Latest Reply
Anonymous · Not applicable
  • 1 kudos

@Kasper H: Yes, you are correct in your understanding that in Databricks jobs with multiple tasks, the {{run_id}} variable corresponds to the task_run_id and the {{parent_run_id}} variable corresponds to the job_run_id. For Databricks jobs with only ...
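
A sketch of wiring both IDs into a notebook task so the distinction is explicit; the parameter names are illustrative:

```python
# In the job definition, map both variables via base_parameters
# (they resolve at run time):
#   "base_parameters": {"task_run_id": "{{run_id}}",
#                       "job_run_id": "{{parent_run_id}}"}
task_run_id = dbutils.widgets.get("task_run_id")  # this task's run
job_run_id = dbutils.widgets.get("job_run_id")    # the whole job run
print(task_run_id, job_run_id)
```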

RajeshRK · Contributor II
  • 4313 Views
  • 7 replies
  • 2 kudos

How to optimize job performance

Hi Team, we have a complex ETL job running in Databricks for 6 hours. The cluster has the below configuration: min workers: 16, max workers: 24, worker and driver node type: Standard_DS14_v2 (16 cores, 128 GB RAM). I have monitored the job progress in Spark...

Latest Reply
Anonymous · Not applicable
  • 2 kudos

Hi @Rajesh Kannan R, thank you for your question! To assist you better, please take a moment to review the answer and let me know if it best fits your needs. Please help us select the best solution by clicking on "Select As Best" if it does. Your feedb...
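
For readers looking for starting points: no config change substitutes for reading the Spark UI (skew, shuffle spill, GC time), but these are common first knobs to check. A sketch with illustrative values:

```python
# Adaptive query execution re-optimizes plans at runtime and can
# mitigate skewed joins; shuffle partitions should be sized to the data.
spark.conf.set("spark.sql.adaptive.enabled", "true")
spark.conf.set("spark.sql.adaptive.skewJoin.enabled", "true")
spark.conf.set("spark.sql.shuffle.partitions", "400")  # illustrative value
```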

6 More Replies
Therdpong · New Contributor III
  • 1885 Views
  • 2 replies
  • 0 kudos

How to check which job clusters have expanded their disks

We would like to know how to check which job clusters have had to expand their disks.

Latest Reply
jose_gonzalez · Databricks Employee
  • 0 kudos

You can check the cluster's event logs: type "disk" in the search box and you will see all the related events there.
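
The same check can be scripted against the Clusters API events endpoint. A sketch with placeholder host, token, and cluster ID; the disk-expansion event type names below are my assumption from the API docs, so verify them against your workspace's reference:

```python
import requests

HOST = "adb-1234567890123456.7.azuredatabricks.net"  # placeholder
TOKEN = "dapi..."                                     # placeholder

resp = requests.post(
    f"https://{HOST}/api/2.0/clusters/events",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json={"cluster_id": "0123-456789-abcdefgh",  # placeholder cluster ID
          "event_types": ["EXPANDED_DISK_SUCCESSFULLY",
                          "FAILED_TO_EXPAND_DISK"]},
)
resp.raise_for_status()
for event in resp.json().get("events", []):
    print(event["timestamp"], event["type"])
```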

1 More Replies
successhawk · New Contributor II
  • 2408 Views
  • 3 replies
  • 2 kudos

Resolved! Is there a way to tell if a created job is not compliant with configured cluster policies before it runs?

As a DevOps engineer, I want to enforce cluster policies at deployment time when the job is deployed/created, well before it is time to actually use it (i.e. before its scheduled/triggered run time without actually running it).

Latest Reply
irfanaziz · Contributor II
  • 2 kudos

Is it not the linked service that defines the kind of cluster created or used for any job? So I believe you could control the configuration via the linked service settings.
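
One way to approximate this in CI is to fetch the policy definition and compare the job's cluster spec against its fixed rules before deploying. A sketch with placeholder host, token, policy ID, and spec; real policies also support ranges and allowlists, so treat this as a starting point only:

```python
import json
import requests

HOST = "adb-1234567890123456.7.azuredatabricks.net"  # placeholder
TOKEN = "dapi..."                                     # placeholder

# The Cluster Policies API returns the policy definition as a JSON string.
policy = requests.get(
    f"https://{HOST}/api/2.0/policies/clusters/get",
    headers={"Authorization": f"Bearer {TOKEN}"},
    params={"policy_id": "ABC123"},  # placeholder policy ID
).json()
definition = json.loads(policy["definition"])

cluster_spec = {"spark_version": "13.3.x-scala2.12",
                "node_type_id": "Standard_DS3_v2"}   # the job's cluster spec
for key, rule in definition.items():
    if rule.get("type") == "fixed" and key in cluster_spec:
        if cluster_spec[key] != rule["value"]:
            print(f"violation: {key} must be {rule['value']}")
```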

2 More Replies
Pragat · New Contributor
  • 1273 Views
  • 1 reply
  • 0 kudos

Databricks job parameterization

I am configuring a Databricks job using multiple notebooks that depend on each other. All the notebooks are parameterized and use similar parameters. How can I configure the parameters at a global level so that all the notebooks can consume...

Latest Reply
Aviral-Bhardwaj · Esteemed Contributor III
  • 0 kudos

Actually, it is very hard, but if you want an alternative option you have to change your code and use the widget feature of Databricks. Maybe this is not the right option, but you can still explore this doc for testing purposes: https://docs.databric...
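
One concrete shape of the widget approach: give every notebook task the same base_parameters in the job definition and read them identically in each notebook (newer workspaces also offer job-level parameters that all tasks inherit). A sketch with illustrative parameter names:

```python
# Job definition side (same base_parameters on every task):
#   "tasks": [
#     {"task_key": "a", "notebook_task": {"notebook_path": "...",
#        "base_parameters": {"env": "dev", "run_date": "2024-01-01"}}},
#     {"task_key": "b", "notebook_task": {"notebook_path": "...",
#        "base_parameters": {"env": "dev", "run_date": "2024-01-01"}}}
#   ]
# Notebook side (identical in each notebook):
env = dbutils.widgets.get("env")
run_date = dbutils.widgets.get("run_date")
```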

alhuelamo · New Contributor II
  • 7878 Views
  • 4 replies
  • 1 kudos

Getting non-traceable NullPointerExceptions

We're running a job that's issuing NullPointerException without traces of our job's code. Does anybody know what would be the best course of action when it comes to debugging these issues? The job is a Scala job running on DBR 11.3 LTS. In case it's rel...

Latest Reply
UmaMahesh1 · Honored Contributor III
  • 1 kudos

A NullPointerException occurs when you access an instance method on a null reference, try to access elements in a null array, or call a method on an object referred to by a null value. To give you a suggestion on how to avoid that, we might ...

3 More Replies
mr_poola49 · New Contributor III
  • 1956 Views
  • 0 replies
  • 5 kudos

Azure Databricks Jobs Connection Timeout (Read Failed)

Azure Databricks jobs failed intermittently due to a connection timeout (Read Failed) while executing an MS SQL stored procedure in an Azure SQL database. My requirement is to process delta records (get delta records using the last refresh date) from Da...
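
Intermittent read timeouts against Azure SQL are often transient, so a common mitigation is to wrap the stored-procedure call in retries with backoff. A sketch using the JVM DriverManager available on Databricks clusters; the JDBC URL, credentials, and procedure name are placeholders:

```python
import time

jdbc_url = ("jdbc:sqlserver://myserver.database.windows.net:1433;"
            "database=mydb;loginTimeout=30")  # placeholder server/database

def exec_proc(retries=3, backoff_s=30):
    driver_manager = spark._sc._gateway.jvm.java.sql.DriverManager
    for attempt in range(1, retries + 1):
        try:
            conn = driver_manager.getConnection(jdbc_url, "user", "password")
            try:
                conn.createStatement().execute("EXEC dbo.process_delta_records")
                return
            finally:
                conn.close()
        except Exception:
            if attempt == retries:
                raise
            time.sleep(backoff_s * attempt)  # back off before retrying

exec_proc()
```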
