I'm fitting multiple models in parallel. For each one, I'm logging lots of params and metrics to MLflow. I'm hitting rate limits, which is causing problems in my jobs.
My dataset has an "item" column which partitions the rows into many groups. (Think of these groups as items in a store.) I want to fit 1 ML model per group. Should I tune hyperparameters for each group separately? Or should I tune them once for the entire dataset?
2021-09 webinar: Automating the ML Lifecycle With Databricks Machine Learning (Post 2 of 2)

Thank you to everyone who joined! You can access the on-demand recording here and the code in this GitHub repo. We're sharing a subset of the questions asked and answered during the webinar.
2021-09 webinar: Automating the ML Lifecycle With Databricks Machine Learning (Post 1 of 2)

Thank you to everyone who joined the Automating the ML Lifecycle With Databricks Machine Learning webinar! You can access the on-demand recording here and the code in this GitHub repo.
I believe it's still the best option. That said, it would be good to know what the OData API is needed for. When I added the original answer, Databricks SQL was nowhere near where it is today, and it's now easy to connect DB SQL directly to Power BI...
The first thing to try is to log in batches. If you are logging each param and metric separately, you're making one API call per param and one per metric. Instead, you should use the batch logging APIs; e.g., use `log_params` instead of `log_param`: http...
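For illustration, here's a minimal sketch of batched logging in Python; the param and metric names are placeholders, not anything specific to your job:

```python
import mlflow

# Illustrative values only; substitute your own params and metrics.
params = {"max_depth": 6, "learning_rate": 0.1, "num_trees": 200}
metrics = {"rmse": 0.42, "mae": 0.31, "r2": 0.87}

with mlflow.start_run():
    mlflow.log_params(params)    # one API call for all params
    mlflow.log_metrics(metrics)  # one API call for all metrics
```

Each call sends the whole dictionary in a single request, so a run with dozens of params and metrics drops from dozens of API calls to a handful, which goes a long way toward staying under the rate limits.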
For the first question ("which option is better?"), the answer depends on your understanding of the problem domain. Do you expect similar behavior across the groups (items)? If so, that's a +1 in favor of sharing hyperparameters. And vice versa: if the items are likely to behave quite differently, that's a point in favor of tuning hyperparameters per group.
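To make the trade-off concrete, here's a rough sketch of fitting one model per item group with shared hyperparameters, using `applyInPandas` on a Spark DataFrame. The column names, features, and model here are placeholder assumptions; tuning per group would mean running a small search inside `fit_per_item` instead of reusing `SHARED_PARAMS`.

```python
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

# Option 1: hyperparameters tuned once and shared across all item groups.
SHARED_PARAMS = {"n_estimators": 100, "max_depth": 5}

def fit_per_item(pdf: pd.DataFrame) -> pd.DataFrame:
    # Fit one model for a single "item" group; feature/label column names are made up.
    features = pdf[["feature1", "feature2"]]
    labels = pdf["label"]
    model = RandomForestRegressor(**SHARED_PARAMS)
    model.fit(features, labels)
    return pd.DataFrame(
        {"item": [pdf["item"].iloc[0]],
         "train_score": [model.score(features, labels)]}
    )

# With a Spark DataFrame `df` that has an "item" column:
# results = df.groupBy("item").applyInPandas(
#     fit_per_item,
#     schema="item string, train_score double",
# )
```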
Both are valid choices. By default, I'd recommend using Hyperopt nowadays. Here's the rationale, as pros & cons of each.

Spark ML's built-in tools
- Pros: These fit the Spark ML Pipeline framework, so you can keep using the same type of APIs.
- Cons: These...
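As an illustration of the Hyperopt side, here's a minimal sketch; the model, dataset, and search space are placeholders and not tied to the webinar code:

```python
from hyperopt import fmin, tpe, hp, STATUS_OK, Trials, SparkTrials
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

X, y = load_diabetes(return_X_y=True)

def objective(params):
    # Hyperopt minimizes the loss, so return the negated CV score.
    model = RandomForestRegressor(
        n_estimators=int(params["n_estimators"]),
        max_depth=int(params["max_depth"]),
        random_state=0,
    )
    score = cross_val_score(model, X, y, cv=3).mean()
    return {"loss": -score, "status": STATUS_OK}

search_space = {
    "n_estimators": hp.quniform("n_estimators", 50, 300, 50),
    "max_depth": hp.quniform("max_depth", 2, 10, 1),
}

best = fmin(
    fn=objective,
    space=search_space,
    algo=tpe.suggest,   # adaptive (TPE) search rather than an exhaustive grid
    max_evals=20,
    trials=Trials(),    # on Databricks, SparkTrials(parallelism=4) distributes trials across the cluster
)
print(best)
```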
The MLflow run was probably created either (a) via notebook autologging or (b) via a call to `mlflow.start_run()`. With (a), when the notebook first logs something to MLflow, it starts a run. But if the notebook is still active and attached to a cluster...
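A small sketch of (b) in Python, with placeholder param/metric names, showing how the context manager keeps a run from lingering as the notebook's active run:

```python
import mlflow

# (b) An explicit run: the context manager ends the run when the block exits.
with mlflow.start_run(run_name="example-run"):  # run_name is just an illustrative value
    mlflow.log_param("alpha", 0.5)
    mlflow.log_metric("rmse", 0.42)

# If autologging (or a bare mlflow.start_run()) left a run open in the notebook,
# it can be closed explicitly; this is a no-op when no run is active.
mlflow.end_run()
```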