Data Engineering

Forum Posts

Gopal269673
by Contributor
  • 953 Views
  • 2 replies
  • 0 kudos

Calling jobs inside another job

Hi all, I have created two job flows, one for the transaction layer and another for the datamart layer. I need to specify a dependency between job1 and job2 so that job2 is triggered after job1 completes, without using any other orchestration tool o...

Latest Reply
Priyag1
Honored Contributor II
  • 0 kudos

Verify with documentation

1 More Replies
Tacuma
by New Contributor II
  • 943 Views
  • 4 replies
  • 1 kudos

Scheduling jobs with Airflow results in each task running multiple jobs.

Hey everyone, I'm experimenting with running containerized PySpark jobs in Databricks and orchestrating them with Airflow. However, I am encountering an issue. When I trigger an Airflow DAG and look at the logs, I see that Airflow is spinni...

Latest Reply
Tacuma
New Contributor II
  • 1 kudos

Both, I guess? Yes, all jobs share the same config. My question is why there are 3 job runs in the same Airflow task log. I'm hoping there's something in the configs that may give me some kind of clue.

3 More Replies
swzzzsw
by New Contributor III
  • 5471 Views
  • 5 replies
  • 10 kudos

"Run now with different parameters" - different parameters not recognized by jobs involving multiple tasks

I'm running a Databricks job involving multiple tasks and would like to run the job with a different set of task parameters. I can achieve that by editing each task and changing the parameter values. However, it gets very manual when I have a lot of tas...

Latest Reply
erens
New Contributor II
  • 10 kudos

Hello, I am also facing the same issue. The problem is described below: I have a multi-task job. This job consists of multiple "spark_python_task" tasks that execute a Python script on a Spark cluster. This pipeline is created within a CI/CD ...

4 More Replies
dbrick
by New Contributor II
  • 754 Views
  • 2 replies
  • 1 kudos

Multiple Jobs with different resource requirements on the same cluster

I have a big cluster with the auto-scaling feature enabled (min: 1, max: 25). I want to run multiple jobs on that cluster with different values of Spark properties (`--executor-cores` and `--executor-memory`), but I don't see any option to specify the sam...

Latest Reply
Vidula
Honored Contributor
  • 1 kudos

Hi @Neelesh databricks, hope everything is going great. Just wanted to check in on whether you were able to resolve your issue. If yes, would you be happy to mark an answer as best so that other members can find the solution more quickly? If not, please tell ...

1 More Replies
pawelmitrus
by Contributor
  • 2300 Views
  • 4 replies
  • 1 kudos

Why Databricks spawns multiple jobs

I have a Delta table `spark101.airlines` (sourced from `/databricks-datasets/airlines/`) partitioned by `Year`. My `spark.sql.shuffle.partitions` is set to the default of 200. I run a simple query: select Origin, count(*) from spark101.airlines group by Origi...

Latest Reply
User16753725469
Contributor II
  • 1 kudos

Could you please paste the query plan here so we can analyse the issue?

3 More Replies