03-23-2023 03:09 PM
Hello, we have Databricks Python workbooks accessing Delta tables. These workbooks are scheduled/invoked by Azure Data Factory. How can I enable Photon on the linked services that are used to call Databricks?
If I specify a new job cluster, there does not seem to be any way to specify that the job cluster should be Photon enabled.
There are JSON parameters under the Advanced tab, but I cannot figure out whether these will let me specify Photon.
Or, will I have to use an existing cluster?
Thanks
03-23-2023 11:51 PM
@Martin Huige
Currently I don't see a way to use Photon through the ADF Databricks Linked Service.
What you can do instead is use the Jobs API to run Databricks notebooks from ADF, rather than going through the Linked Service.
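As a rough illustration of the Jobs API route, here is a minimal sketch of a request body for a one-time run (`POST /api/2.1/jobs/runs/submit`) whose job cluster asks for Photon through the `runtime_engine` field. The function name, run name, notebook path, node type and worker count are placeholders I made up for the example, and you would need to confirm that the node type you choose supports Photon. From ADF, a body like this could be sent with a Web activity.

```python
import json


def build_photon_run_payload(notebook_path,
                             spark_version="13.3.x-scala2.12",
                             node_type_id="Standard_DS3_v2",  # placeholder node type
                             num_workers=2):                  # placeholder size
    """Build a runs/submit body with a Photon-enabled job cluster (sketch)."""
    return {
        "run_name": "adf-triggered-photon-run",  # hypothetical name
        "tasks": [
            {
                "task_key": "run_notebook",
                "notebook_task": {"notebook_path": notebook_path},
                "new_cluster": {
                    "spark_version": spark_version,
                    "node_type_id": node_type_id,
                    "num_workers": num_workers,
                    # Ask for the Photon engine on the job cluster.
                    "runtime_engine": "PHOTON",
                },
            }
        ],
    }


if __name__ == "__main__":
    # Print the JSON you would paste into (or generate for) the HTTP request body.
    print(json.dumps(build_photon_run_payload("/Shared/etl/load_delta"), indent=2))
```

The sketch only builds the payload; authentication (for example a bearer token in the `Authorization` header) and polling for run completion are left out.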
01-16-2024 01:30 PM
I have found that if you use a job cluster from a pool in ADF, it creates a cluster per Databricks ADF activity, and you end up with more than one cluster running.
I have a shared compute cluster for ADF with Photon/Unity enabled and a fixed worker count. I start the Databricks cluster via the REST API before the ETL runs, saving 5 to 10 minutes of cluster start-up time.
The ETL then runs the notebooks via the Databricks ADF activity, and once it has finished, the cluster is stopped using the REST API.
It works well and gives you control over what gets spun up. You can also use spot instances to save resource costs.
API Reference : https://docs.databricks.com/api/workspace/clusters/ (start & terminate)
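For anyone wanting to try the start/stop pattern, below is a minimal sketch of how the two calls could be built in Python. It assumes the `/api/2.0/clusters/start` and `/api/2.0/clusters/delete` endpoints (the latter terminates the cluster without permanently removing it) and a personal access token; the helper name, workspace URL, token and cluster ID are placeholders, not values from this thread.

```python
import json
import urllib.request

# Assumed endpoint paths; "delete" terminates (stops) the cluster.
_ENDPOINTS = {
    "start": "/api/2.0/clusters/start",
    "terminate": "/api/2.0/clusters/delete",
}


def build_cluster_request(workspace_url, token, action, cluster_id):
    """Build (but do not send) a start or terminate request for a cluster."""
    if action not in _ENDPOINTS:
        raise ValueError(f"action must be one of {sorted(_ENDPOINTS)}")
    return urllib.request.Request(
        url=workspace_url.rstrip("/") + _ENDPOINTS[action],
        data=json.dumps({"cluster_id": cluster_id}).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    req = build_cluster_request(
        "https://adb-0000000000000000.0.azuredatabricks.net",  # placeholder URL
        "<personal-access-token>",                              # placeholder token
        "start",
        "<cluster-id>",                                         # placeholder ID
    )
    print(req.get_method(), req.full_url)
    # To actually send it: urllib.request.urlopen(req)
```

In ADF the same two calls could be made with Web activities placed before and after the Databricks notebook activities.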
Regards
Toby
https://thedatacrew.com
04-01-2023 10:14 PM
@Martin Huige :
To enable Photon for your Databricks Python workbooks that are scheduled/invoked by Azure Data Factory, you will need to use an existing Databricks cluster that is configured with Photon. At this time, creating a new job cluster in Azure Data Factory does not provide an option to enable Photon.
To use an existing Databricks cluster that is configured with Photon, you can specify the cluster ID in the Databricks Linked Service configuration in Azure Data Factory. To do this, follow these steps:
With this configuration, the Databricks activity in your Azure Data Factory pipeline will use the specified Databricks cluster, which is configured with Photon, to run your Python workbook.
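To make the existing-cluster option concrete, here is a minimal sketch that generates what such a linked service definition might look like. It assumes the `AzureDatabricks` linked service type with the `domain`, `accessToken` and `existingClusterId` properties; the function name, linked service name, workspace URL, token and cluster ID are all placeholders.

```python
import json


def existing_cluster_linked_service(name, domain, cluster_id, access_token):
    """Linked service definition pointing at an existing (Photon) cluster (sketch)."""
    return {
        "name": name,
        "properties": {
            "type": "AzureDatabricks",
            "typeProperties": {
                "domain": domain,
                "accessToken": {"type": "SecureString", "value": access_token},
                # ID of the interactive cluster that already has Photon enabled.
                "existingClusterId": cluster_id,
            },
        },
    }


if __name__ == "__main__":
    definition = existing_cluster_linked_service(
        "DatabricksPhotonCluster",                              # placeholder name
        "https://adb-0000000000000000.0.azuredatabricks.net",  # placeholder URL
        "<cluster-id>",                                         # placeholder ID
        "<personal-access-token>",                              # placeholder token
    )
    print(json.dumps(definition, indent=2))
```

In practice the token would more likely come from Azure Key Vault than be embedded in the definition.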
04-03-2023 02:46 PM
Thank you very much!
01-16-2024 12:44 PM
You can enable Photon on a Databricks cluster via an ADF linked service. Simply set the cluster version in the linked service to a Photon-enabled version (e.g. 13.3.x-photon-scala2.12).
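A minimal sketch of what this could look like in the linked service JSON, generated here from Python. It assumes the `newClusterVersion`, `newClusterNodeType` and `newClusterNumOfWorker` properties of the `AzureDatabricks` linked service type; the function name, node type, worker count, workspace URL and token are placeholders, and only the version string comes from this post.

```python
import json

PHOTON_VERSION = "13.3.x-photon-scala2.12"  # version string quoted in the post


def new_job_cluster_linked_service(name, domain, access_token,
                                   version=PHOTON_VERSION,
                                   node_type="Standard_DS3_v2",  # placeholder
                                   num_workers="2"):             # placeholder
    """Linked service definition for a new job cluster on a Photon runtime (sketch)."""
    return {
        "name": name,
        "properties": {
            "type": "AzureDatabricks",
            "typeProperties": {
                "domain": domain,
                "accessToken": {"type": "SecureString", "value": access_token},
                # The "-photon-" runtime version is what selects Photon here.
                "newClusterVersion": version,
                "newClusterNodeType": node_type,
                "newClusterNumOfWorker": num_workers,
            },
        },
    }


if __name__ == "__main__":
    print(json.dumps(
        new_job_cluster_linked_service(
            "DatabricksPhotonJobCluster",                           # placeholder
            "https://adb-0000000000000000.0.azuredatabricks.net",  # placeholder
            "<personal-access-token>",                              # placeholder
        ),
        indent=2,
    ))
```

The same value can be typed into the cluster version field when editing the linked service in the ADF UI or its JSON view.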
05-06-2024 03:19 AM
This worked for us with Job Compute clusters. Thanks!
01-16-2024 11:22 PM
When you create a cluster on Databricks, you can enable Photon by selecting the "Photon" option in the cluster configuration settings. This is typically done when creating a new cluster, and you would find the option in the advanced cluster configuration settings.