Machine Learning
Dive into the world of machine learning on the Databricks platform. Explore discussions on algorithms, model training, deployment, and more. Connect with ML enthusiasts and experts.
Data + AI Summit 2024 - Data Science & Machine Learning

Forum Posts

Rexe
by New Contributor
  • 385 Views
  • 1 reply
  • 0 kudos

TypeError: float() argument must be a string or a number, not 'StepArtifact'?

How can I get the content of a returned variable in ZenML without hitting this error: TypeError: float() argument must be a string or a number, not 'StepArtifact'?

Latest Reply
Kaniz_Fatma
Community Manager
  • 0 kudos

Hi @Rexe, First, verify the data type of the variable you’re trying to convert. If it’s a StepArtifact, you’ll need to extract the relevant value before converting it to a float.

  • 0 kudos
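The pattern in the reply above can be sketched in plain Python. This is a hedged illustration only: `FakeArtifact` below is a hypothetical stand-in for ZenML's StepArtifact, which in a real pipeline you would materialise through ZenML's own API rather than the `load()` helper invented here.

```python
class FakeArtifact:
    """Hypothetical stand-in for a lazy artifact reference (not ZenML's class)."""
    def __init__(self, value):
        self._value = value

    def load(self):
        # Materialise the concrete value the wrapper points at.
        return self._value


def to_float(obj):
    """Extract the concrete value from an artifact wrapper before float()."""
    if hasattr(obj, "load"):
        obj = obj.load()
    return float(obj)


print(to_float(FakeArtifact("3.14")))  # 3.14
print(to_float(42))                    # 42.0
```

The point is simply that `float()` must receive the materialised value, never the lazy artifact object itself.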
rahuja
by New Contributor III
  • 693 Views
  • 3 replies
  • 0 kudos

Accessing Unity Catalog's MLFlow model registry from outside Databricks

Hello everyone, we are integrating Unity Catalog in our organisation's Databricks. In our case we are planning to move our inference from Databricks to Kubernetes. In order to make the inference code use the latest registered model, we need to query the...

Latest Reply
p4pratikjain
Contributor
  • 0 kudos

I have used Glue in the past to score models that are registered in the Databricks MLflow registry. You need to configure MLflow on Kubernetes to access your model registry. You can use something like this - https://docs.databricks.com/en/mlflow/access-ho...

  • 0 kudos
2 More Replies
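The configuration p4pratikjain points at can be sketched roughly as below. Everything concrete here is a placeholder, not from the thread: the workspace host, the token, and the three-level Unity Catalog model name are all hypothetical, and the actual `mlflow` calls are shown commented out since they need live credentials.

```python
import os

# Placeholders -- substitute your workspace URL and a real access token.
os.environ["DATABRICKS_HOST"] = "https://example-workspace.cloud.databricks.com"
os.environ["DATABRICKS_TOKEN"] = "dapi-placeholder-token"

# Hypothetical three-level UC name: catalog.schema.model, plus an alias.
model_name = "main.ml_models.churn_model"
model_uri = f"models:/{model_name}@champion"
print(model_uri)  # models:/main.ml_models.churn_model@champion

# With mlflow installed and valid credentials, loading from Kubernetes
# (or anywhere outside Databricks) would look something like:
# import mlflow
# mlflow.set_tracking_uri("databricks")
# mlflow.set_registry_uri("databricks-uc")  # Unity Catalog registry
# model = mlflow.pyfunc.load_model(model_uri)
```

Using an alias such as `@champion` means the inference code always resolves to whichever version currently carries that alias, so redeploys are not needed when a new model is promoted.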
datastones
by Contributor
  • 629 Views
  • 2 replies
  • 0 kudos

Resolved! Deployment as code pattern with double training effort?

Hi everybody, I have a question re: the deployment-as-code pattern on Databricks. I found and watched a great demo here: https://www.youtube.com/watch?v=JApPzAnbfPI My question is, in the case where I can get read access to prod data in the dev env, the d...

Latest Reply
Kaniz_Fatma
Community Manager
  • 0 kudos

Hi @datastones, There are a couple of ways to address the redundant model retraining when using the deployment-as-code pattern on Databricks. One is to use the "deploy models" paradigm instead of "deploy code": in this approach, you develop and train the model ...

  • 0 kudos
1 More Replies
rahuja
by New Contributor III
  • 404 Views
  • 2 replies
  • 0 kudos

Create Databricks Dashboards on MLFlow Metrics

Hello, currently we have multiple ML models running in production which log metrics and other metadata to MLflow. I wanted to ask: is it possible to build Databricks dashboards on top of this data, and can this data somehow be avail...

Latest Reply
rahuja
New Contributor III
  • 0 kudos

Hello @Kaniz_Fatma, thanks for responding. I think you are talking about using the Python API, but we don't want that. Since MLflow also uses a SQL table to store metrics, is it possible to expose those tables as a part of our meta-store and build da...

  • 0 kudos
1 More Replies
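MLflow's backing SQL tables are an internal implementation detail, so the commonly suggested approach is the one rahuja was steered toward: export run metrics through the API into a table you control and build dashboards on that. A minimal sketch of the data flow, where the `runs` list below is fabricated sample data standing in for what `mlflow.search_runs` or `MlflowClient` would return:

```python
# Fabricated sample runs -- in practice these would come from
# mlflow.search_runs(), which already returns a pandas DataFrame.
runs = [
    {"run_id": "r1", "metrics": {"rmse": 0.42, "r2": 0.81}},
    {"run_id": "r2", "metrics": {"rmse": 0.39, "r2": 0.84}},
]

# Flatten to long-format rows: one (run, metric, value) triple per row,
# which is the shape dashboard tools query most easily.
rows = [
    {"run_id": run["run_id"], "metric": name, "value": value}
    for run in runs
    for name, value in sorted(run["metrics"].items())
]
print(rows[0])  # {'run_id': 'r1', 'metric': 'r2', 'value': 0.81}

# In a scheduled Databricks job you might then do something like
# spark.createDataFrame(rows).write.mode("append").saveAsTable(...)
# and point a Databricks dashboard at that Delta table.
```

This keeps the dashboard decoupled from MLflow's internal schema, which can change between releases.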
datastones
by Contributor
  • 805 Views
  • 2 replies
  • 0 kudos

Resolved! ML model promotion from Databricks dev workspace to prod workspace

Hi everybody. I am relatively new to Databricks. I am working on an ML model promotion process between different Databricks workspaces. I am aware that best practice should be deployment as code (e.g. export the whole training pipeline and model regi...

Latest Reply
amr
Valued Contributor
  • 0 kudos

I am aware that models registered in Databricks Unity Catalog (UC) in the prod workspace can be loaded from dev workspace for model comparison/debugging. But to comply with best practices, we restrict access to assets in UC in the dev workspace fro...

  • 0 kudos
1 More Replies
hadoan
by New Contributor II
  • 557 Views
  • 1 reply
  • 0 kudos

Cannot use Databricks ARC as demo code

I read the link about Databricks ARC - https://github.com/databricks-industry-solutions/auto-data-linkage - and ran it on the DBR 12.2 LTS ML runtime on Databricks Community Edition, but I got the error below: 2024/07/08 04:25:33 INFO mlflow.tracking.fluent: E...

Latest Reply
Kaniz_Fatma
Community Manager
  • 0 kudos

Hi @hadoan,  Ensure that the data you are providing to the auto_link function is in the correct format and does not have any issues, such as missing values or inconsistent data types. The ARC package relies on the data being in a valid Spark DataFram...

  • 0 kudos
adrianna2942842
by New Contributor III
  • 612 Views
  • 1 reply
  • 0 kudos

Deployment with model serving failed after entering "DEPLOYMENT_READY" state

Hi, I was trying to update a config for an endpoint, by adding a new version of an entity (version 7). The new model entered "DEPLOYMENT_READY" state, but the deployment failed with timed out exception. I didn't get any other exception in Build or Se...

Latest Reply
Kumaran
Valued Contributor III
  • 0 kudos

Hi @adrianna2942842, Thank you for contacting the Databricks community. May I know how you are loading the model?

  • 0 kudos
ChanduBhujang
by New Contributor II
  • 352 Views
  • 1 reply
  • 0 kudos

Pyspark models iterative/augmented training capability

Do PySpark tree-based models have iterative or augmented training capabilities? Similar to how the sklearn package can train models from a model artifact and use that model to continue training with additional data? #ML_Models_Pyspark

Latest Reply
Kumaran
Valued Contributor III
  • 0 kudos

Hi @ChanduBhujang, Thank you for contacting the Databricks community. PySpark tree-based models do not have built-in iterative or augmented training capabilities like scikit-learn's partial_fit method. While there are workarounds to update the model wit...

  • 0 kudos
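The workaround the reply alludes to usually amounts to retraining on the union of the original and new data, since Spark ML tree estimators have no `partial_fit` equivalent. A toy sketch of that data flow, where a trivial mean predictor stands in for the Spark estimator purely to show the retrain-on-combined-data pattern:

```python
def fit(rows):
    """Stand-in 'training': returns the mean target. In real code this
    would be e.g. a pyspark.ml GBTRegressor fit on a DataFrame."""
    targets = [label for _, label in rows]
    return sum(targets) / len(targets)


original_data = [(1, 10.0), (2, 12.0)]
new_batch = [(3, 20.0)]

model_v1 = fit(original_data)                 # initial training run
model_v2 = fit(original_data + new_batch)     # "augmented" = full retrain
print(model_v1, model_v2)  # 11.0 14.0
```

The cost of this approach grows with the accumulated dataset, which is exactly the trade-off against true incremental learners like scikit-learn's `SGDRegressor.partial_fit`.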
Solide
by New Contributor
  • 8970 Views
  • 9 replies
  • 5 kudos

Databricks runtime version Error

Hello, I'm following courses on the Databricks Academy, using for that purpose Databricks Community Edition with runtime 12.2 LTS (includes Apache Spark 3.3.2, Scala 2.12), which I believe can't be changed. I'm following the Data Engineering c...

Latest Reply
V2dha
New Contributor III
  • 5 kudos

I was facing the same error. This could be resolved by adding the version that you are currently working with to the config function present in the '_common' notebook in the 'Includes' folder. (This was the case for the folder structure that I downloaded f...

  • 5 kudos
8 More Replies
Deniz_Bilgin
by New Contributor
  • 414 Views
  • 1 reply
  • 0 kudos

Issue Importing transformers Library on Databricks

I'm experiencing an issue when trying to import the "transformers" library in a Databricks notebook. The import statement causes the notebook to hang indefinitely without any error messages. The library works perfectly on my local machine using Anaco...

Latest Reply
Kaniz_Fatma
Community Manager
  • 0 kudos

Hi @Deniz_Bilgin, Make sure the files you’re trying to import are actual Python .py files, not notebook files. Databricks expects Python modules, so ensure that your files have the correct extension. Experiment with importing modules using sys.path...

  • 0 kudos
Psybelo
by New Contributor II
  • 3151 Views
  • 4 replies
  • 3 kudos

DE 2.2 - Providing Options for External Sources - Classroom setup error

Hi all, I am unable to execute the "Classroom-Setup-02.2" setup in the Data Engineering course. I get the following error: FileNotFoundError: [Errno 2] No such file or directory: '/dbfs/mnt/dbacademy-datasets/data-engineer-learning-path/v01/ecommerce/raw/u...

Latest Reply
Eagle78
New Contributor III
  • 3 kudos

Inspired by https://stackoverflow.com/questions/58984925/pandas-missing-read-parquet-function-in-azure-databricks-notebook I changed df = pd.read_parquet(path = datasource_path.replace("dbfs:/", '/dbfs/')) # original, error! into df = spark.read.format(...

  • 3 kudos
3 More Replies
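The path rewrite in Eagle78's reply comes from the fact that pandas needs the local FUSE mount (`/dbfs/...`) while Spark understands the `dbfs:/` URI scheme. A small sketch of that translation (the example path is illustrative):

```python
def to_fuse_path(path: str) -> str:
    """Rewrite a dbfs:/ URI to its /dbfs/ FUSE-mount equivalent."""
    if path.startswith("dbfs:/"):
        return "/dbfs/" + path[len("dbfs:/"):]
    return path


uri = "dbfs:/mnt/dbacademy-datasets/data-engineer-learning-path/v01"
print(to_fuse_path(uri))
# /dbfs/mnt/dbacademy-datasets/data-engineer-learning-path/v01

# On a cluster you would then use whichever reader matches the path form:
# pandas: pd.read_parquet(to_fuse_path(uri))
# Spark:  spark.read.parquet(uri)  -- or read with Spark and call .toPandas()
```

Note that the FUSE mount is not available on all cluster types (e.g. some serverless and shared-access configurations), which is why reading with Spark and converting is often the more portable fix.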
bbashuk
by New Contributor II
  • 599 Views
  • 1 reply
  • 0 kudos

How to implement early stop in SparkXGBRegressor with Pipeline?

Trying to implement an early-stopping mechanism in a SparkXGBRegressor model with a Pipeline: from pyspark.ml.feature import VectorAssembler, StringIndexer from pyspark.ml import Pipeline, PipelineModel from xgboost.spark import SparkXGBRegressor from x...

Latest Reply
bbashuk
New Contributor II
  • 0 kudos

Ok, I finally solved it - added a column to the dataset, set validation_indicator_col='validation_0', and did not pass that column to the VectorAssembler: xgboost_regressor = SparkXGBRegressor() xgboost_regressor.setParams( gamma=0.2, max_depth=6, obje...

  • 0 kudos
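The essential part of bbashuk's fix is that the validation indicator is a boolean column marking hold-out rows, and it must be excluded from the assembled feature vector. A plain-Python sketch of that bookkeeping (dicts stand in for a Spark DataFrame; column names are illustrative):

```python
# Toy dataset: plain dicts standing in for Spark DataFrame rows.
rows = [{"f1": float(i), "f2": float(i * 2), "label": float(i)} for i in range(10)]

# Flag every fifth row as validation; SparkXGBRegressor's
# validation_indicator_col expects such a boolean column.
for i, row in enumerate(rows):
    row["validation_0"] = (i % 5 == 0)

# Crucial step: the indicator (and the label) must NOT be fed to the
# VectorAssembler as features.
feature_cols = [c for c in rows[0] if c not in ("label", "validation_0")]
print(feature_cols)  # ['f1', 'f2']

# The real estimator would then be configured roughly like:
# SparkXGBRegressor(validation_indicator_col="validation_0",
#                   early_stopping_rounds=10, ...)
```

With that wiring, XGBoost evaluates on the flagged rows each boosting round and stops when the validation metric stops improving.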
simranisanewbie
by New Contributor
  • 700 Views
  • 1 reply
  • 0 kudos

Pyspark custom Transformer class -AttributeError: 'DummyMod' object has no attribute 'MyTransformer'

I am trying to create a custom transformer as a stage in my pipeline. A few of the transformations I am doing via SparkNLP and the next few using MLlib. To pass the result of SparkNLP transformation at a stage to the next MLlib transformation, I need...

Latest Reply
Kaniz_Fatma
Community Manager
  • 0 kudos

Hi @simranisanewbie,  Make sure that you’ve imported the MyTransformer class correctly in the code where you’re loading the saved pipeline. Ensure that the import statement matches the actual location of your custom transformer class.In Python, the o...

  • 0 kudos
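The 'DummyMod' error typically means the pipeline was saved with a transformer class defined directly in a notebook cell, so its pickled reference points at a module that does not exist when the pipeline is loaded elsewhere. The fix Kaniz describes is to give the class a real, importable module. The sketch below simulates that with an in-memory module; `my_transformers` is a hypothetical name standing in for an actual `.py` file on the cluster's path.

```python
import pickle
import sys
import types

# Simulate packaging MyTransformer in a real module instead of __main__.
# In practice you would create my_transformers.py and import it normally;
# building the module in memory here just keeps the sketch self-contained.
module = types.ModuleType("my_transformers")
source = (
    "class MyTransformer:\n"
    "    def transform(self, value):\n"
    "        return value\n"
)
exec(source, module.__dict__)
sys.modules["my_transformers"] = module  # what importing the .py would do

from my_transformers import MyTransformer

# pickle records the module-qualified name, so it is now stable:
print(MyTransformer.__module__)            # my_transformers
payload = pickle.dumps(MyTransformer())    # picklable by qualified name
restored = pickle.loads(payload)           # and resolvable again on load
print(type(restored).__module__)           # my_transformers
```

The same principle applies to loading a saved PipelineModel: the module defining the custom transformer must be imported (and importable on every worker) before the load call runs.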
Octavian1
by Contributor
  • 2265 Views
  • 4 replies
  • 1 kudos

port undefined error in SQLDatabase.from_databricks (langchain.sql_database)

The following assignment: from langchain.sql_database import SQLDatabase; dbase = SQLDatabase.from_databricks(catalog=catalog, schema=db, host=host, api_token=token) fails with ValueError: invalid literal for int() with base 10: '' because of cls._assert_p...

Latest Reply
vburam
New Contributor II
  • 1 kudos

I am also facing the same issue; not able to connect even after using SQLAlchemy.

  • 1 kudos
3 More Replies
Betul
by New Contributor
  • 318 Views
  • 1 reply
  • 0 kudos

How to do cicd with different models/versions using databricks resources?

Generally speaking, what are some tips to make the CI/CD process smoother when working with different models and versions?

Latest Reply
robbe
New Contributor III
  • 0 kudos

Hi @Betul, I think there are different ways, but it really depends on what you mean by different models and versions. One simple option is to use Databricks Asset Bundles to create multiple workflows (one for each model) and use the champion-ch...

  • 0 kudos
