
Input training dataset field empty in Configure AutoML experiment

sideshowBob1337
New Contributor II

I'm trying to start an ML experiment on data in an existing metastore within a catalogue (SQL queries run fine on the database). I can start an ML cluster and then attempt to start an AutoML experiment, but I get stuck selecting training data: there are no databases or tables listed.

Do I have to create a dataset and, if so, where? Can I link the existing metastore within the catalogue and, if so, how?

2 REPLIES

feiyun0112
Honored Contributor

You can read the SQL data into a DataFrame and run AutoML in the notebook.

Train ML models with Databricks AutoML Python API | Databricks on AWS
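
A minimal sketch of that approach, assuming the notebook is attached to a Databricks Runtime ML cluster that can already read the table. The helper names `qualified_table_name` and `run_automl_on_table`, the catalog, schema, and table names, and the `label` target column are all placeholders for illustration; `automl.classify` is used here on the assumption that the problem is a classification task.

```python
def qualified_table_name(catalog: str, schema: str, table: str) -> str:
    """Build a backtick-quoted three-level name, e.g. `my_catalog`.`my_schema`.`my_table`."""
    parts = (catalog, schema, table)
    # Double any embedded backticks so the identifier stays valid SQL.
    return ".".join("`" + p.replace("`", "``") + "`" for p in parts)


def run_automl_on_table(spark, table_name: str, target_col: str, timeout_minutes: int = 30):
    """Load a table into a DataFrame and start an AutoML classification experiment."""
    # Imported lazily: databricks.automl is only available on Databricks Runtime ML.
    from databricks import automl

    df = spark.table(table_name)  # spark.sql("SELECT ... FROM ...") works as well
    return automl.classify(
        dataset=df,
        target_col=target_col,
        timeout_minutes=timeout_minutes,
    )


# Example call from a notebook, where `spark` is predefined (all names are placeholders):
# summary = run_automl_on_table(
#     spark, qualified_table_name("my_catalog", "my_schema", "my_table"), "label"
# )
```

Passing a DataFrame this way bypasses the table picker in the AutoML UI, but it still depends on the cluster being able to read the table in the first place.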

Thanks for the suggestion. It turns out the problem is with the setup of the compute cluster: the cluster I have been using is set up correctly and can access the metastore, but it isn't an ML runtime cluster, and when I set up my own ML runtime cluster I cannot configure it to see the metastore. I think this means the metastore is external to Databricks, and I have asked the admin to either set up an ML cluster with the right authentication or allow me to create one. Long story short, I can't do as you suggest, because attaching the notebook to an ML cluster also breaks the connection that the SQL query relies on.
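
For anyone hitting the same symptom, one way to confirm this kind of diagnosis is to run the same check in a notebook attached first to the working cluster and then to the ML cluster, and compare what each one can see. This is only a rough sketch: the helper names are hypothetical, and `spark` is assumed to be the notebook's predefined session.

```python
def visible_catalogs(spark) -> list:
    """Return the catalog names visible from the attached cluster."""
    return [row[0] for row in spark.sql("SHOW CATALOGS").collect()]


def visible_schemas(spark, catalog: str) -> list:
    """Return the schema (database) names visible inside one catalog."""
    return [row[0] for row in spark.sql(f"SHOW SCHEMAS IN `{catalog}`").collect()]


# In a notebook (the catalog name is a placeholder):
# print(visible_catalogs(spark))
# print(visible_schemas(spark, "my_catalog"))
```

If the catalog shows up on the working cluster but not on the ML cluster, the difference lies in the cluster configuration rather than in AutoML itself.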
