I can enable model serving through the MLflow REST API 2.0 with the following code:
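import requests
# Workspace URL and bearer-token auth header
instance = f'https://{workspace}.cloud.databricks.com'
headers = {'Authorization': f'Bearer {api_workflow_access_token}'}
# Enable Model Serving for the registered model
url = f'{instance}/api/2.0/mlflow/endpoints/enable'
response = requests.post(url, headers=headers, json={'registered_model_name': model_name})
response.raise_for_status()  # raise if the request returned an HTTP error status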
However, this automatically sets the cluster's instance type to m5a.xlarge, which I do not want. I can change it to m4.large manually in the serving settings in the UI, but I would like to set the instance type from the API code above so that I don't have to change it by hand each time.