Data Engineering

Use Spot Instances with Azure Data Factory Linked Service

MarcoCaviezel
New Contributor III

In my pipeline I'm using Azure Data Factory to trigger Databricks notebooks through a linked service.

I want to use spot instances for my job clusters.

Is there a way to achieve this?

I didn't find a way to do this in the GUI.

Thanks for your help!

Marco

5 REPLIES

Anonymous
Not applicable

@Marco Caviezel - Does the information in this thread, Spot instance in Azure Databricks, answer your question?

-werners-
Esteemed Contributor III

Hi Marco,

The easiest way to do this is to use a pool that you define as 'All Spot'.

In the pool definition you can set 'min idle' to 0 so that no nodes sit idle (if you do not want warm instances).

I expect MS to add a 'spot' selector in ADF though, without the need to work with pools.
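
For anyone who prefers to script the pool rather than click through the UI, here is a minimal sketch of what the request body for the Databricks Instance Pools API (`POST /api/2.0/instance-pools/create`) could look like. It assumes that the 'All Spot' option corresponds to `availability: SPOT_AZURE` in `azure_attributes`. The pool name, node type, capacity and idle timeout are placeholder values, and `build_spot_pool_payload` is a hypothetical helper, not part of any SDK. The snippet only builds and prints the JSON; sending it to your workspace with a token is left out.

```python
import json


def build_spot_pool_payload(name, node_type_id, max_capacity):
    """Build a request body for an instance pool that uses only spot VMs.

    Hypothetical helper: the field names follow the Instance Pools API,
    the values passed in are placeholders chosen for illustration.
    """
    return {
        "instance_pool_name": name,
        "node_type_id": node_type_id,
        # 0 idle instances: no warm nodes are kept around doing nothing.
        "min_idle_instances": 0,
        "max_capacity": max_capacity,
        # Placeholder timeout for instances returned to the pool.
        "idle_instance_autotermination_minutes": 15,
        "azure_attributes": {
            # Assumption: this is what 'All Spot' maps to in the UI.
            "availability": "SPOT_AZURE",
            # -1 means: accept any spot price up to the on-demand price.
            "spot_bid_max_price": -1,
        },
    }


payload = build_spot_pool_payload("adf-spot-pool", "Standard_DS3_v2", 10)
print(json.dumps(payload, indent=2))
```

The ID of the pool created from such a payload is what the ADF linked service then refers to.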

MarcoCaviezel
New Contributor III

Hi @Werner Stinckens,

Thanks a lot for your helpful information!

Until now I had never worked with pools, but after taking a closer look at them I can profit from a lot of other advantages on top of the "All Spot" option.

Have a great day!

Cheers,

Marco

MarcoCaviezel
New Contributor III

Hi @Werner Stinckens,

Just a quick follow up question.

Does it make sense to you that you can select the following options in Azure Data Factory?

[image.png]

To my understanding, "Cluster version", "Python version" and the "Worker options" are defined when I create the cluster in Databricks.

Thanks a lot for your help!

Marco

-werners-
Esteemed Contributor III

It does not make sense indeed as the pool decides this. I think this is still WIP in ADF.

I select the same Python version as the pool, and make sure the autoscaling does not exceed the max nodes.
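
As a rough illustration of that check, the sketch below builds an ADF linked service definition that points at a pool and refuses to produce it if the autoscaling range would not fit into the pool. `instancePoolId`, `newClusterVersion` and `newClusterNumOfWorker` are properties of the Azure Databricks linked service; everything else (the helper `build_linked_service`, the workspace URL, pool ID, runtime version and worker counts) is a placeholder. It also assumes the driver is taken from the same pool, so one extra node is reserved for it. Authentication settings are omitted.

```python
def build_linked_service(workspace_url, pool_id, spark_version,
                         min_workers, max_workers, pool_max_capacity):
    """Build an ADF linked service definition that uses an instance pool.

    Hypothetical helper for illustration only.
    """
    # Assumption: the driver also comes from the pool, so workers + 1
    # must fit within the pool's max capacity.
    if max_workers + 1 > pool_max_capacity:
        raise ValueError(
            f"max_workers={max_workers} plus one driver exceeds "
            f"pool capacity {pool_max_capacity}"
        )
    return {
        "name": "AzureDatabricksSpotPool",
        "properties": {
            "type": "AzureDatabricks",
            "typeProperties": {
                "domain": workspace_url,
                "instancePoolId": pool_id,
                "newClusterVersion": spark_version,
                # "min:max" enables autoscaling between the two values.
                "newClusterNumOfWorker": f"{min_workers}:{max_workers}",
            },
        },
    }


linked_service = build_linked_service(
    workspace_url="https://adb-0000000000000000.0.azuredatabricks.net",
    pool_id="<your-pool-id>",
    spark_version="13.3.x-scala2.12",
    min_workers=1,
    max_workers=4,
    pool_max_capacity=10,
)
print(linked_service["properties"]["typeProperties"]["newClusterNumOfWorker"])
```

The node type is deliberately not set here, since the pool already fixes it.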
