Input training dataset field empty in Configure AutoML experiment
02-08-2024 09:18 AM
I'm trying to start an ML experiment on data in an existing metastore within a catalogue (SQL queries run fine against the database). I can start an ML cluster, but when I attempt to start an AutoML experiment I get stuck selecting training data: no databases or tables are listed.
Do I have to create a dataset, and if so, where? Can I link the existing metastore within the catalogue, and if so, how?
02-08-2024 07:21 PM
You can read the SQL data into a DataFrame and run AutoML from a notebook:
Train ML models with Databricks AutoML Python API | Databricks on AWS
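For example, a minimal sketch of that workflow on a cluster running Databricks Runtime ML (the table name `main.default.sales` and target column `label` below are placeholders for your own data):

```python
from databricks import automl

# Read the existing table from the metastore into a Spark DataFrame.
# Replace with your own catalog.schema.table.
df = spark.table("main.default.sales")

# Run an AutoML classification experiment on the DataFrame.
# "label" is a placeholder for your target column.
summary = automl.classify(
    dataset=df,
    target_col="label",
    timeout_minutes=30,
)

# Path to the best model logged by the experiment.
print(summary.best_trial.model_path)
```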
02-21-2024 02:27 AM
Thanks for the suggestion. It turns out the problem is with the setup of the compute cluster: the cluster I have been using is configured correctly and can access the metastore, but it isn't an ML runtime, and when I set up my own ML runtime cluster I cannot configure it to see the metastore. I think this means the metastore is external to Databricks, so I have asked the admin to either set up an ML cluster with the right authentication or allow me to create one. Long story short, I can't do as you suggest, because attaching the notebook to an ML cluster also breaks the connection to the SQL query.

