05-29-2023 01:11 AM
Hello,
I am trying to complete the exercises of the course "Scalable Machine Learning with Apache Spark" using Databricks Community Edition, but when I run the Lab Setup I get the following error:
HTTPError: 503 Server Error: Service Unavailable for url: https://community.cloud.databricks.com/api/2.0/feature-store/feature-tables/search?max_results=10000...
Response from server:
{ 'error_code': 'TEMPORARILY_UNAVAILABLE',
'message': 'The service at /api/2.0/feature-store/feature-tables/search is '
'temporarily unavailable. Please try again later.'}
Command skipped
It seems that the error is thrown when executing the following code in the "Classroom-Setup" script:
import re
DA = DBAcademyHelper(course_config, lesson_config)
DA.reset_lesson()
DA.init()
DA.init_mlflow_as_job()
DA.conclude_setup()
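(Side note: the failing call is just an HTTP request to the workspace REST API. A minimal stdlib sketch, using a function name of my own invention rather than anything from Databricks, that recognizes this particular error payload could look like this:)

```python
import json

def is_service_unavailable(response_text: str) -> bool:
    """Return True if a Databricks REST error payload reports the
    TEMPORARILY_UNAVAILABLE error code, as in the 503 above."""
    try:
        payload = json.loads(response_text)
    except (json.JSONDecodeError, TypeError):
        return False
    return payload.get("error_code") == "TEMPORARILY_UNAVAILABLE"

# The payload from the error above, rewritten as valid JSON:
body = ('{"error_code": "TEMPORARILY_UNAVAILABLE", '
        '"message": "The service is temporarily unavailable. '
        'Please try again later."}')
print(is_service_unavailable(body))  # True
```

That said, in this thread the 503 turned out not to be transient (see the follow-up below), so retrying alone does not help.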
How could I fix this error?
Thanks!
07-05-2023 12:54 AM - edited 07-05-2023 01:06 AM
After reviewing the DBAcademyHelper code, I have seen that the problem is that the Community Edition does not have the following features:
- Feature Store
- MLflow Model Registry
- MLflow Endpoints
I have read that the reason is that the Community Edition does not offer production tools. I think it would be a good idea to include these features in a limited form (for example, having the Feature Store, Model Registry, and Endpoints reset after several hours in the Community Edition), so that no one could use Databricks for free in production, but those of us taking the courses could still use them.
It is possible to modify the setup code so that some exercises run, but I don't recommend it, because it only lets you complete one or two exercises (the ones that use only MLflow Experiments); to complete the rest you would need the Model Registry.
---
If anyone is interested in the setup code, these are the modifications I made in "Includes/Classroom-Setup":
class CommunityEditionDBAcademyHelper(DBAcademyHelper):

    def cleanup(self, validate_datasets: bool = True) -> None:
        from dbacademy.dbhelper.dataset_manager_class import DatasetManager
        from dbacademy.dbhelper.workspace_cleaner_class import WorkspaceCleaner

        wc = WorkspaceCleaner(self)
        status = False

        if self.lesson_config.name is None:
            print("Resetting the learning environment:")
        else:
            print(f"Resetting the learning environment ({self.lesson_config.name}):")

        dbgems.spark.catalog.clearCache()
        status = wc._stop_all_streams() or status

        if self.lesson_config.enable_ml_support:
            # These features are missing in Community Edition, so wrap each
            # cleanup call and warn instead of failing the whole setup.
            try:
                status = wc._drop_feature_store_tables(lesson_only=True) or status
            except Exception:
                print("WARNING: Feature Store not available!")
            try:
                status = wc._cleanup_mlflow_endpoints(lesson_only=True) or status
            except Exception:
                print("WARNING: MLflow Endpoints not available!")
            try:
                status = wc._cleanup_mlflow_models(lesson_only=True) or status
            except Exception:
                print("WARNING: MLflow Model Registry not available!")
            status = wc._cleanup_experiments(lesson_only=True) or status

        status = wc._drop_catalog() or status
        status = wc._drop_schema() or status

        # Always last, to remove DB files that are not removed by SQL DROP operations.
        status = wc._cleanup_working_dir() or status

        if not status:
            print("| No action taken")

        if validate_datasets:
            DatasetManager.from_dbacademy_helper(self).validate_datasets(fail_fast=True)

    def reset_lesson(self):
        return self.cleanup(validate_datasets=False)

    def reset_learning_environment(self):
        from dbacademy.dbhelper.workspace_cleaner_class import WorkspaceCleaner

        wc = WorkspaceCleaner(self)
        print("Resetting the learning environment for all lessons:")
        start = dbgems.clock_start()

        dbgems.spark.catalog.clearCache()
        wc._stop_all_streams()

        if self.lesson_config.enable_ml_support:
            try:
                wc._drop_feature_store_tables(lesson_only=False)
            except Exception:
                print("WARNING: Feature Store not available!")
            try:
                wc._cleanup_mlflow_endpoints(lesson_only=False)
            except Exception:
                print("WARNING: MLflow Endpoints not available!")
            try:
                wc._cleanup_mlflow_models(lesson_only=False)
            except Exception:
                print("WARNING: MLflow Model Registry not available!")
            wc._cleanup_experiments(lesson_only=False)

        wc._reset_databases()
        wc._reset_datasets()
        wc._reset_working_dir()

        print(f"| the learning environment was successfully reset {dbgems.clock_stopped(start)}.")


import re

DA = CommunityEditionDBAcademyHelper(course_config, lesson_config)
DA.reset_lesson()
DA.init()
DA.init_mlflow_as_job()
DA.conclude_setup()
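As a design note, the same try/except appears once per optional feature; it could be factored into a small helper. This is just a sketch under my own naming (`try_optional` is not part of dbacademy):

```python
from typing import Callable

def try_optional(step: Callable[[], bool], feature: str) -> bool:
    """Run a cleanup step that may depend on a feature missing from
    Community Edition; warn instead of raising if it fails."""
    try:
        return bool(step())
    except Exception:
        print(f"WARNING: {feature} not available!")
        return False

# Hypothetical usage, mirroring the cleanup method above:
# status = try_optional(
#     lambda: wc._drop_feature_store_tables(lesson_only=True),
#     "Feature Store") or status
```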
06-14-2023 02:03 PM
Adding @Suteja Kanuri and @Vidula Khanna for visibility.
06-15-2023 04:04 AM
Hi,
Same issue here. @Iago Gonzalez did you manage to carry on somehow in the meantime?
Thanks!
07-03-2023 11:15 PM
Hey @iago_gonzalez @JCV @jose_gonzalez any solution? I am facing the same problem too.
07-04-2023 01:51 AM
I think it is all related to using the Community Edition, unfortunately. Unless someone from Databricks says otherwise, there is no solution other than using a "proper" (paid) version of it.
07-20-2023 03:30 AM
Hi @iago_gonzalez ,
I'm a regular user of Databricks in my job. It's a nice ecosystem, but I find it a shame that the first cell of a notebook in a specialization won't run because the Community Edition doesn't support features the specialization requires. A lot of wasted time!
11-12-2023 12:30 PM
Hello,
Has anyone found a solution to this issue? Has anyone from Databricks confirmed that it is not possible to follow the course with the Community Edition?
Thanks in advance!
02-17-2024 07:25 AM
I'm experiencing the same issue while using community edition for this classroom: https://github.com/databricks-academy/large-language-models. What subscription level do I upgrade to?