12-20-2024 09:34 AM - edited 12-20-2024 09:38 AM
So, I don't understand whether it is possible to overwrite an existing workflow. It could become a mess if someone changes the cluster configuration, so I want to be sure that only one workflow is active, and that it uses the new configuration. I'm asking because one workflow was already deployed in production, and when it was deployed again with a new cluster configuration it was duplicated instead of overwriting the existing one. Why does it create a new workflow?
prod:
  workspace:
    host: <host_url>
    root_path: /Workspace/Users/${workspace.current_user.userName}/.bundle/${bundle.name}/${bundle.target}
  mode: production
  # permissions:
  #   - user_name: ${workspace.current_user.userName}
  #     level: CAN_MANAGE
  run_as:
    service_principal_name: <sp_id>
  sync:
    exclude:
      - ./notebook/stg/*.*
  resources:
    jobs:
      sync_delta_and_db:
        name: sync_delta_and_db_${bundle.target}
        schedule: # runs the job every day at 3AM
          quartz_cron_expression: "0 0 3 * * ?"
          timezone_id: "UTC"
        tasks:
          - task_key: sync_delta_${bundle.target}
            job_cluster_key: sync_delta_${bundle.target}_cluster
            notebook_task:
              notebook_path: ./notebook/${bundle.target}/db_sync_initial_wip.ipynb
              source: WORKSPACE
            libraries:
              - whl: ${workspace.root_path}/files/dist/<lib>-0.0.1-py3-none-any.whl
        job_clusters: # TODO: this needs to be resized once we understand how to handle massive data properly
          - job_cluster_key: sync_delta_${bundle.target}_cluster
            new_cluster:
              spark_version: 15.4.x-scala2.12
              node_type_id: Standard_DS3_v2
              runtime_engine: PHOTON
              num_workers: 0
              spark_conf:
                spark.databricks.cluster.profile: singleNode
                spark.master: local[*]
              custom_tags:
                ResourceClass: SingleNode
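
One thing that may be worth checking, as an assumption rather than a confirmed cause: the root_path above embeds ${workspace.current_user.userName}, so it would resolve to a different folder for each identity that runs the deployment. If the bundle keeps its deployment state under root_path, a second deployer would not see the job the first one created, and that could show up as a duplicate workflow. The Python sketch below only imitates the placeholder substitution to make this visible. The `resolve` helper, the bundle name `my_bundle`, and both user names are hypothetical, and this is not the Databricks CLI's actual interpolation code.

```python
import re

# root_path template copied from the prod target in the configuration.
ROOT_PATH_TEMPLATE = (
    "/Workspace/Users/${workspace.current_user.userName}"
    "/.bundle/${bundle.name}/${bundle.target}"
)


def resolve(template: str, values: dict) -> str:
    """Replace each ${key} placeholder with values[key].

    Simplified stand-in for bundle variable interpolation (hypothetical helper).
    """
    return re.sub(r"\$\{([^}]+)\}", lambda m: values[m.group(1)], template)


def root_path_for(user_name: str) -> str:
    """Resolve the root_path as it would look for a given deploying identity."""
    return resolve(
        ROOT_PATH_TEMPLATE,
        {
            "workspace.current_user.userName": user_name,
            "bundle.name": "my_bundle",  # hypothetical bundle name
            "bundle.target": "prod",
        },
    )


# Two hypothetical identities deploying the same bundle to the same target.
first = root_path_for("alice@example.com")
second = root_path_for("sp-deployer")

print(first)
print(second)
print("same root_path:", first == second)
```

If this turns out to be the cause, a root_path that does not depend on the deploying user (for example, a fixed shared folder) would make every deployment of the prod target resolve to the same location.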