Machine Learning
Dive into the world of machine learning on the Databricks platform. Explore discussions on algorithms, model training, deployment, and more. Connect with ML enthusiasts and experts.

Forum Posts

migq2
by New Contributor III
  • 2639 Views
  • 1 reply
  • 0 kudos

Cannot log SparkML model to Unity Catalog due to missing output signature

I am training a Spark ML model (concretely a SynapseML LightGBM) in Databricks using MLflow and autolog. When I try to register my model in Unity Catalog, I get the following error: MlflowException: Model passed for registration contained a signature th...

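The error usually means the autologged signature lacks an output schema, which Unity Catalog requires. A minimal sketch of one common fix, assuming `pipeline_model` is the fitted pipeline and `train_df` the training DataFrame (both hypothetical names), is to infer and pass an explicit signature:

```python
import mlflow
from mlflow.models import infer_signature

# Unity Catalog rejects models whose signature has no outputs, so build one
# explicitly from sample inputs and the model's own predictions.
predictions = pipeline_model.transform(train_df)
signature = infer_signature(
    train_df.drop("label").limit(100).toPandas(),            # example inputs
    predictions.select("prediction").limit(100).toPandas(),  # example outputs
)

mlflow.set_registry_uri("databricks-uc")
with mlflow.start_run():
    mlflow.spark.log_model(
        pipeline_model,
        artifact_path="model",
        signature=signature,
        registered_model_name="main.ml_models.lightgbm_model",  # hypothetical UC name
    )
```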
rahuja
by Contributor
  • 1170 Views
  • 1 reply
  • 0 kudos

Create Databricks Dashboards on MLFlow Metrics

Hello. Currently we have multiple ML models running in production that log metrics and other metadata to MLflow. I wanted to ask: is it possible to build Databricks dashboards on top of this data, and can this data be somehow avail...

Latest Reply
rahuja
Contributor
  • 0 kudos

Hello @Retired_mod, thanks for responding. I think you are talking about using the Python API, but we don't want that. Is it possible, since MLflow also uses a SQL table to store metrics, to expose those tables as part of our metastore and build da...

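On Databricks the managed MLflow tracking store is not directly queryable as SQL tables, so the usual pattern is to export run metrics through the tracking API into a Delta table that dashboards can read. A minimal sketch, with a hypothetical experiment ID, metric name, and target table:

```python
import mlflow

# Pull runs for an experiment into a pandas DataFrame; metric columns come
# back named "metrics.<name>".
runs = mlflow.search_runs(experiment_ids=["1234567890"])  # hypothetical ID

subset = runs[["run_id", "start_time", "metrics.rmse"]].rename(
    columns={"metrics.rmse": "rmse"}  # hypothetical metric
)

# Persist as a Delta table that a Databricks SQL dashboard can query.
(spark.createDataFrame(subset)
    .write.mode("overwrite")
    .saveAsTable("main.ml_monitoring.mlflow_run_metrics"))  # hypothetical table
```

Scheduling this as a periodic job keeps the table fresh without touching MLflow's internal storage.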
datastones
by Contributor
  • 3690 Views
  • 2 replies
  • 0 kudos

Resolved! ML model promotion from Databricks dev workspace to prod workspace

Hi everybody. I am relatively new to Databricks. I am working on an ML model promotion process between different Databricks workspaces. I am aware that best practice should be deployment as code (e.g. export the whole training pipeline and model regi...

Latest Reply
amr
Databricks Employee
  • 0 kudos

I am aware that models registered in Databricks Unity Catalog (UC) in the prod workspace can be loaded from the dev workspace for model comparison/debugging. But to comply with best practices, we restrict access to assets in UC in the dev workspace fro...

1 More Replies
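For completeness, a hedged sketch of the deploy-code-friendly alternative the thread alludes to: once a version is validated, copy it between UC model names instead of granting cross-workspace access to dev assets. This assumes a recent MLflow (2.8+); the catalog and model names are hypothetical.

```python
import mlflow
from mlflow import MlflowClient

mlflow.set_registry_uri("databricks-uc")
client = MlflowClient()

# Re-register the validated dev version under the prod name; the artifacts
# are copied as-is, nothing is retrained.
client.copy_model_version(
    src_model_uri="models:/dev.ml_models.churn_model/3",
    dst_name="prod.ml_models.churn_model",
)

# Consumers load by alias rather than hard-coded version numbers.
client.set_registered_model_alias("prod.ml_models.churn_model", "champion", 1)
model = mlflow.pyfunc.load_model("models:/prod.ml_models.churn_model@champion")
```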
hadoan
by New Contributor II
  • 1122 Views
  • 0 replies
  • 0 kudos

Cannot use Databricks ARC as demo code

I read the link about Databricks ARC - https://github.com/databricks-industry-solutions/auto-data-linkage - and ran it on the DBR 12.2 LTS ML runtime environment on Databricks Community Edition, but I got the error below: 2024/07/08 04:25:33 INFO mlflow.tracking.fluent: E...

adrianna2942842
by New Contributor III
  • 3593 Views
  • 1 reply
  • 0 kudos

Deployment with model serving failed after entering "DEPLOYMENT_READY" state

Hi, I was trying to update a config for an endpoint by adding a new version of an entity (version 7). The new model entered the "DEPLOYMENT_READY" state, but the deployment failed with a timed-out exception. I didn't get any other exception in Build or Se...

Attachments: deployment_fail.PNG, deployment_failed2.PNG
Latest Reply
Kumaran
Databricks Employee
  • 0 kudos

Hi @adrianna2942842, Thank you for contacting the Databricks community. May I know how you are loading the model?

ChanduBhujang
by New Contributor II
  • 1128 Views
  • 1 reply
  • 0 kudos

PySpark models iterative/augmented training capability

Do PySpark tree-based models have iterative or augmented training capabilities? Something similar to how the sklearn package can train a model, save the artifact, and continue training it with additional data? #ML_Models_Pyspark

Latest Reply
Kumaran
Databricks Employee
  • 0 kudos

Hi @ChanduBhujang, Thank you for contacting the Databricks community. PySpark tree-based models do not have built-in iterative or augmented training capabilities like Scikit-learn's partial_fit method. While there are workarounds to update the model wit...

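Since Spark's tree learners cannot be warm-started, the workaround hinted at above boils down to refitting on the old data unioned with the new batch. A minimal sketch, with hypothetical `old_df`/`new_df` DataFrames that share `features` and `label` columns:

```python
from pyspark.ml.regression import GBTRegressor

# No partial_fit equivalent exists in Spark ML, so "augmented training"
# means a full refit on the combined dataset.
combined_df = old_df.unionByName(new_df)

gbt = GBTRegressor(featuresCol="features", labelCol="label", maxIter=50)
model = gbt.fit(combined_df)
```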
Solide
by New Contributor
  • 14441 Views
  • 7 replies
  • 6 kudos

Databricks runtime version Error

Hello, I'm following courses on the Databricks Academy, using for that purpose the Databricks Community Edition with runtime 12.2 LTS (includes Apache Spark 3.3.2, Scala 2.12), which I believe can't be changed. I'm following the Data Engineering c...

Latest Reply
V2dha
New Contributor III
  • 6 kudos

I was facing the same error. This could be resolved by adding the version that you are currently working with to the config function in the '_common' notebook in the 'Includes' folder. (This was the case for my folder structure that I downloaded f...

6 More Replies
Psybelo
by New Contributor II
  • 5015 Views
  • 4 replies
  • 3 kudos

DE 2.2 - Providing Options for External Sources - Classroom setup error

Hi all, I am unable to execute the "Classroom-Setup-02.2" setup in the Data Engineering course. I get the following error: FileNotFoundError: [Errno 2] No such file or directory: '/dbfs/mnt/dbacademy-datasets/data-engineer-learning-path/v01/ecommerce/raw/u...

Latest Reply
Eagle78
New Contributor III
  • 3 kudos

Inspired by https://stackoverflow.com/questions/58984925/pandas-missing-read-parquet-function-in-azure-databricks-notebook, I changed df = pd.read_parquet(path = datasource_path.replace("dbfs:/", '/dbfs/')) # original, error! into df = spark.read.format(...

3 More Replies
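A minimal sketch of the substitution described in that reply, assuming `datasource_path` is the dbfs:/ URI supplied by the classroom setup (shown truncated here):

```python
import pandas as pd

datasource_path = "dbfs:/mnt/dbacademy-datasets/data-engineer-learning-path/..."  # from the course config

# Original call, which fails where the /dbfs FUSE mount is unavailable:
# df = pd.read_parquet(path=datasource_path.replace("dbfs:/", "/dbfs/"))

# Replacement: read through Spark, converting only if a pandas frame is needed.
df = spark.read.format("parquet").load(datasource_path).toPandas()
```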
bbashuk
by New Contributor II
  • 3185 Views
  • 1 reply
  • 0 kudos

How to implement early stop in SparkXGBRegressor with Pipeline?

Trying to implement an early-stopping mechanism in a SparkXGBRegressor model with a Pipeline: from pyspark.ml.feature import VectorAssembler, StringIndexer; from pyspark.ml import Pipeline, PipelineModel; from xgboost.spark import SparkXGBRegressor; from x...

Latest Reply
bbashuk
New Contributor II
  • 0 kudos

OK, I finally solved it - added a column to the dataset (validation_indicator_col='validation_0') and did not pass it to the VectorAssembler: xgboost_regressor = SparkXGBRegressor(); xgboost_regressor.setParams(gamma=0.2, max_depth=6, obje...

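Pulling the accepted answer together, a sketch of the full pattern (feature, label, and indicator names here are hypothetical): flag the validation rows in a boolean column that the regressor reads directly, and keep that column out of the VectorAssembler inputs.

```python
from pyspark.ml import Pipeline
from pyspark.ml.feature import VectorAssembler
from pyspark.sql import functions as F
from xgboost.spark import SparkXGBRegressor

# Mark roughly 20% of rows as the early-stopping evaluation set.
train_df = train_df.withColumn("validation_0", F.rand(seed=42) < 0.2)

assembler = VectorAssembler(inputCols=feature_cols, outputCol="features")
xgb = SparkXGBRegressor(
    features_col="features",
    label_col="label",
    validation_indicator_col="validation_0",  # not part of the feature vector
    early_stopping_rounds=10,
    eval_metric="rmse",
)
model = Pipeline(stages=[assembler, xgb]).fit(train_df)
```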
simranisanewbie
by New Contributor II
  • 1784 Views
  • 0 replies
  • 1 kudos

PySpark custom Transformer class - AttributeError: 'DummyMod' object has no attribute 'MyTransformer'

I am trying to create a custom transformer as a stage in my pipeline. A few of the transformations I am doing via SparkNLP and the next few using MLlib. To pass the result of a SparkNLP transformation at one stage to the next MLlib transformation, I need...

Machine Learning
Custom Transformer
MLflow
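The 'DummyMod' AttributeError typically appears when a class defined only inside a notebook is serialized and then deserialized where the interpreter cannot re-import it. A hedged sketch of the usual remedy: define the transformer in an importable module (a repo file or wheel) using the standard Param mixins. `MyTransformer` and its logic are illustrative only.

```python
# my_transformers.py - an importable module, not notebook-local code
from pyspark import keyword_only
from pyspark.ml import Transformer
from pyspark.ml.param.shared import HasInputCol, HasOutputCol
from pyspark.ml.util import DefaultParamsReadable, DefaultParamsWritable
from pyspark.sql import functions as F


class MyTransformer(Transformer, HasInputCol, HasOutputCol,
                    DefaultParamsReadable, DefaultParamsWritable):
    """Illustrative stage: upper-cases inputCol into outputCol."""

    @keyword_only
    def __init__(self, inputCol=None, outputCol=None):
        super().__init__()
        kwargs = self._input_kwargs
        self._set(**kwargs)

    def _transform(self, dataset):
        return dataset.withColumn(
            self.getOutputCol(), F.upper(F.col(self.getInputCol()))
        )
```

Because the class lives in a real module, Spark can pickle and unpickle pipeline stages that reference it.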
Octavian1
by Contributor
  • 4008 Views
  • 3 replies
  • 1 kudos

port undefined error in SQLDatabase.from_databricks (langchain.sql_database)

The following assignment: from langchain.sql_database import SQLDatabase; dbase = SQLDatabase.from_databricks(catalog=catalog, schema=db, host=host, api_token=token) fails with ValueError: invalid literal for int() with base 10: '' because of cls._assert_p...

Latest Reply
vburam
New Contributor II
  • 1 kudos

I am also facing the same issue. Not able to connect even after using SQLAlchemy.

2 More Replies
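A hedged workaround sketch for the thread above: the int('') failure suggests the port could not be parsed from the host, so pass a bare hostname (no https:// scheme or trailing slash) and identify the SQL warehouse explicitly so langchain can assemble the connection itself. The hostname and warehouse ID are hypothetical; `catalog`, `db`, and `token` are the variables from the original post.

```python
from langchain.sql_database import SQLDatabase

dbase = SQLDatabase.from_databricks(
    catalog=catalog,
    schema=db,
    host="adb-1234567890123456.7.azuredatabricks.net",  # bare hostname, no scheme
    api_token=token,
    warehouse_id="1234567890abcdef",  # hypothetical SQL warehouse ID
)
```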
Betul
by New Contributor
  • 1120 Views
  • 1 reply
  • 0 kudos

How to do CI/CD with different models/versions using Databricks resources?

Generally speaking, what are the tips to make the CI/CD process better when working with different versions and models?

Latest Reply
robbe
Contributor
  • 0 kudos

Hi @Betul, I think that there are different ways, but it really depends on what you mean by different models and versions. One simple option is to use Databricks Asset Bundles to create multiple workflows (one for each model) and use the champion-ch...

rasgaard
by New Contributor
  • 2827 Views
  • 1 reply
  • 0 kudos

Model Serving Endpoints - Build configuration and Interactive access

Hi there, I have used the Databricks Model Serving endpoints to serve a model which depends on some config files and a custom library. The library has been included by logging the model with the `code_path` argument in `mlflow.pyfunc.log_model`, and it...

Latest Reply
robbe
Contributor
  • 0 kudos

Hi @rasgaard, one way to achieve that without inspecting the container is to use MLflow artifacts. Artifacts allow you to log files together with your models and reference them inside the endpoint. For example, let's assume that you need to include a ...

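A minimal sketch of the artifact pattern that reply describes: bundle a config file with the model at logging time and read it back via `context.artifacts` inside the endpoint. The file names and the wrapper class are hypothetical.

```python
import mlflow

class ConfiguredModel(mlflow.pyfunc.PythonModel):
    def load_context(self, context):
        # context.artifacts maps the keys below to the packaged file paths.
        with open(context.artifacts["config"]) as f:
            self.config = f.read()

    def predict(self, context, model_input):
        return model_input  # placeholder inference logic

with mlflow.start_run():
    mlflow.pyfunc.log_model(
        artifact_path="model",
        python_model=ConfiguredModel(),
        artifacts={"config": "config.yaml"},  # local file bundled with the model
        code_path=["my_custom_library/"],     # custom library, as in the question
    )
```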
