05-29-2023 01:11 AM
Hello,
I am trying to complete the exercises of the course "Scalable Machine Learning with Apache Spark" using Databricks Community Edition, but when I run the Lab Setup I get the following error:
HTTPError: 503 Server Error: Service Unavailable for url: https://community.cloud.databricks.com/api/2.0/feature-store/feature-tables/search?max_results=10000...
Response from server:
{ 'error_code': 'TEMPORARILY_UNAVAILABLE',
'message': 'The service at /api/2.0/feature-store/feature-tables/search is '
'temporarily unavailable. Please try again later.'}
Command skipped
It seems that the error is thrown when executing the following code in the "Classroom-Setup" script:
import re
DA = DBAcademyHelper(course_config, lesson_config)
DA.reset_lesson()
DA.init()
DA.init_mlflow_as_job()
DA.conclude_setup()
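(Side note: the failing call is just an HTTP request to the workspace REST API. A minimal stdlib sketch, using a function name of my own invention rather than anything from Databricks, that recognizes this particular error payload could look like this:)

```python
import json

def is_service_unavailable(response_text: str) -> bool:
    """Return True if a Databricks REST error payload reports the
    TEMPORARILY_UNAVAILABLE error code, as in the 503 above."""
    try:
        payload = json.loads(response_text)
    except (json.JSONDecodeError, TypeError):
        return False
    return payload.get("error_code") == "TEMPORARILY_UNAVAILABLE"

# The payload from the error above, rewritten as valid JSON:
body = ('{"error_code": "TEMPORARILY_UNAVAILABLE", '
        '"message": "The service is temporarily unavailable. '
        'Please try again later."}')
print(is_service_unavailable(body))  # True
```

That said, in this thread the 503 turned out not to be transient (see the follow-up below), so retrying alone does not help.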
How could I fix this error?
Thanks!
07-05-2023 12:54 AM - edited 07-05-2023 01:06 AM
After reviewing the DBAcademyHelper code, I have seen that the problem is that the Community Edition does not have the following features:
- Feature Store
- MLflow Model Registry
- MLflow Endpoints
I have read that the reason is that the Community Edition does not offer production tools. I think it would be a good idea to include these features in a limited form (for example, having the Feature Store, Model Registry, and Endpoints reset after several hours in the Community Edition), so that no one could use Databricks for free in production, but those of us taking the courses could still use them.
It is possible to modify the setup code so that some exercises run, but I don't recommend it, because it only lets you complete one or two exercises (the ones that use only MLflow Experiments); to complete the rest you would need the Model Registry.
---
If anyone is interested in the setup code, these are the modifications I made in "Includes/Classroom-Setup":
class CommunityEditionDBAcademyHelper(DBAcademyHelper):

    def cleanup(self, validate_datasets: bool = True) -> None:
        from dbacademy.dbhelper.dataset_manager_class import DatasetManager
        from dbacademy.dbhelper.workspace_cleaner_class import WorkspaceCleaner

        wc = WorkspaceCleaner(self)
        status = False

        if self.lesson_config.name is None:
            print("Resetting the learning environment:")
        else:
            print(f"Resetting the learning environment ({self.lesson_config.name}):")

        dbgems.spark.catalog.clearCache()
        status = wc._stop_all_streams() or status

        if self.lesson_config.enable_ml_support:
            # These features are missing in Community Edition, so wrap each
            # cleanup call and warn instead of failing the whole setup.
            try:
                status = wc._drop_feature_store_tables(lesson_only=True) or status
            except Exception:
                print("WARNING: Feature Store not available!")
            try:
                status = wc._cleanup_mlflow_endpoints(lesson_only=True) or status
            except Exception:
                print("WARNING: MLflow Endpoints not available!")
            try:
                status = wc._cleanup_mlflow_models(lesson_only=True) or status
            except Exception:
                print("WARNING: MLflow Model Registry not available!")
            status = wc._cleanup_experiments(lesson_only=True) or status

        status = wc._drop_catalog() or status
        status = wc._drop_schema() or status

        # Always last, to remove DB files that are not removed by SQL DROP operations.
        status = wc._cleanup_working_dir() or status

        if not status:
            print("| No action taken")

        if validate_datasets:
            DatasetManager.from_dbacademy_helper(self).validate_datasets(fail_fast=True)

    def reset_lesson(self):
        return self.cleanup(validate_datasets=False)

    def reset_learning_environment(self):
        from dbacademy.dbhelper.workspace_cleaner_class import WorkspaceCleaner

        wc = WorkspaceCleaner(self)
        print("Resetting the learning environment for all lessons:")
        start = dbgems.clock_start()

        dbgems.spark.catalog.clearCache()
        wc._stop_all_streams()

        if self.lesson_config.enable_ml_support:
            try:
                wc._drop_feature_store_tables(lesson_only=False)
            except Exception:
                print("WARNING: Feature Store not available!")
            try:
                wc._cleanup_mlflow_endpoints(lesson_only=False)
            except Exception:
                print("WARNING: MLflow Endpoints not available!")
            try:
                wc._cleanup_mlflow_models(lesson_only=False)
            except Exception:
                print("WARNING: MLflow Model Registry not available!")
            wc._cleanup_experiments(lesson_only=False)

        wc._reset_databases()
        wc._reset_datasets()
        wc._reset_working_dir()

        print(f"| the learning environment was successfully reset {dbgems.clock_stopped(start)}.")


import re

DA = CommunityEditionDBAcademyHelper(course_config, lesson_config)
DA.reset_lesson()
DA.init()
DA.init_mlflow_as_job()
DA.conclude_setup()
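As a design note, the same try/except appears once per optional feature; it could be factored into a small helper. This is just a sketch under my own naming (`try_optional` is not part of dbacademy):

```python
from typing import Callable

def try_optional(step: Callable[[], bool], feature: str) -> bool:
    """Run a cleanup step that may depend on a feature missing from
    Community Edition; warn instead of raising if it fails."""
    try:
        return bool(step())
    except Exception:
        print(f"WARNING: {feature} not available!")
        return False

# Hypothetical usage, mirroring the cleanup method above:
# status = try_optional(
#     lambda: wc._drop_feature_store_tables(lesson_only=True),
#     "Feature Store") or status
```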
06-14-2023 02:03 PM
Adding @Suteja Kanuri and @Vidula Khanna for visibility.
06-15-2023 04:04 AM
Hi,
Same issue here. @Iago Gonzalez did you manage to carry on somehow in the meantime?
Thanks!
07-03-2023 11:15 PM
Hey @iago_gonzalez @JCV @jose_gonzalez any solution? I am facing the same problem too.
07-04-2023 01:51 AM
I think it is all related to using the Community Edition, unfortunately. Unless someone from Databricks says otherwise, there is no solution other than using a "proper" (paid) version of it.
07-20-2023 03:30 AM
Hi @iago_gonzalez ,
I'm a regular user of Databricks in my job. It's a nice ecosystem, but I find it a shame that the first cell of a notebook in a specialization won't run because the Community Edition doesn't support features the specialization requires. A lot of wasted time!
11-12-2023 12:30 PM
Hello,
Has anyone found a solution to this issue? Has anyone from Databricks confirmed that it is not possible to follow the course with the Community Edition?
Thanks in advance!
02-17-2024 07:25 AM
I'm experiencing the same issue while using community edition for this classroom: https://github.com/databricks-academy/large-language-models. What subscription level do I upgrade to?