Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Gopal269673
by Contributor
  • 1724 Views
  • 2 replies
  • 0 kudos

Calling jobs inside another job

Hi all. I have created two job flows, one for the transaction layer and another for the datamart layer. I need to define a dependency between job1 and job2, so that job2 is triggered after job1 completes, without using any other orchestration tool o...

Latest Reply
Priyag1
Honored Contributor II
  • 0 kudos

Verify with documentation

1 More Replies
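
One possible way to express the dependency asked about above is a wrapper job whose tasks call the two existing jobs in order. The sketch below only builds the request body for a Jobs API 2.1 job-creation call; it assumes the workspace supports the `run_job_task` task type, and the job IDs (`111`, `222`), the task keys, and the helper name `chained_job_payload` are hypothetical placeholders, not values from the thread.

```python
import json


def chained_job_payload(name, first_job_id, second_job_id):
    """Build a job-creation body that runs two existing jobs in sequence.

    Assumption: the Jobs API accepts `run_job_task` entries that reference
    other jobs by ID. The task keys below are illustrative only.
    """
    return {
        "name": name,
        "tasks": [
            {
                # First task: trigger the transaction-layer job.
                "task_key": "transaction_layer",
                "run_job_task": {"job_id": first_job_id},
            },
            {
                # Second task: starts only after the first task succeeds.
                "task_key": "datamart_layer",
                "depends_on": [{"task_key": "transaction_layer"}],
                "run_job_task": {"job_id": second_job_id},
            },
        ],
    }


# Hypothetical job IDs; replace with the IDs of the two existing jobs.
payload = chained_job_payload("transaction_then_datamart", 111, 222)
print(json.dumps(payload, indent=2))
```

The payload would still need to be sent to the workspace (for example with the Databricks CLI or an HTTP client), which is left out here. If the two flows can be merged, the same `depends_on` field also works between ordinary tasks inside a single multi-task job.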
Tacuma
by New Contributor II
  • 1913 Views
  • 4 replies
  • 1 kudos

Scheduling jobs with Airflow results in each task running multiple jobs.

Hey everyone, I'm experimenting with running containerized PySpark jobs in Databricks and orchestrating them with Airflow. I am, however, encountering an issue. When I trigger an Airflow DAG and look at the logs, I see that Airflow is spinni...

Latest Reply
Tacuma
New Contributor II
  • 1 kudos

Both, I guess? Yes, all jobs share the same config. The question I have is why there are 3 job runs in the same Airflow task log. I'm hoping that there's something in the configs that may give me some kind of clue.

3 More Replies
swzzzsw
by New Contributor III
  • 8626 Views
  • 3 replies
  • 9 kudos

"Run now with different parameters" - different parameters not recognized by jobs involving multiple tasks

I'm running a Databricks job involving multiple tasks and would like to run the job with a different set of task parameters. I can achieve that by editing each task and changing the parameter values. However, it gets very manual when I have a lot of tas...

Latest Reply
erens
New Contributor II
  • 9 kudos

Hello, I am also facing the same issue. The problem is described below: I have a multi-task job. This job consists of multiple "spark_python_task" kind tasks that execute a Python script in a Spark cluster. This pipeline is created within a CI/CD ...

2 More Replies
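
For readers who want to script a run with overridden parameters instead of editing each task by hand, the sketch below builds a request body for the Jobs API `run-now` endpoint. The fields `job_id`, `python_params`, and `notebook_params` exist in that API; the job ID `123`, the example arguments, and the helper name `run_now_body` are hypothetical. How the overrides are applied to each task of a multi-task job is exactly what the thread is asking about, so that behaviour should be checked against the Databricks documentation rather than inferred from this sketch.

```python
import json


def run_now_body(job_id, python_params=None, notebook_params=None):
    """Build a run-now request body with optional parameter overrides.

    Only the fields that are actually provided are included, so the job's
    stored settings are left alone for everything else.
    """
    body = {"job_id": job_id}
    if python_params is not None:
        # Command-line style arguments for Python script tasks.
        body["python_params"] = list(python_params)
    if notebook_params is not None:
        # Key/value pairs for notebook tasks.
        body["notebook_params"] = dict(notebook_params)
    return body


# Hypothetical job ID and arguments, for illustration only.
body = run_now_body(123, python_params=["--run-date", "2023-01-01"])
print(json.dumps(body))
```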
dbrick
by New Contributor II
  • 1287 Views
  • 1 reply
  • 1 kudos

Multiple Jobs with different resource requirements on the same cluster

I have a big cluster with auto-scaling (min: 1, max: 25) enabled. I want to run multiple jobs on that cluster with different values of Spark properties (`--executor-cores` and `--executor-memory`), but I don't see any option to specify the sam...

Latest Reply
Vidula
Honored Contributor
  • 1 kudos

Hi @Neelesh databricks, hope everything is going great. Just wanted to check in if you were able to resolve your issue. If yes, would you be happy to mark an answer as best so that other members can find the solution more quickly? If not, please tell ...

pawelmitrus
by Contributor
  • 3973 Views
  • 4 replies
  • 1 kudos

Why Databricks spawns multiple jobs

I have a Delta table spark101.airlines (sourced from `/databricks-datasets/airlines/`) partitioned by `Year`. My `spark.sql.shuffle.partitions` is set to the default of 200. I run a simple query: select Origin, count(*) from spark101.airlines group by Origi...

Latest Reply
User16753725469
Contributor II
  • 1 kudos

Could you please paste the query plan here so we can analyse the issue?

3 More Replies