Machine Learning
Dive into the world of machine learning on the Databricks platform. Explore discussions on algorithms, model training, deployment, and more. Connect with ML enthusiasts and experts.
Data + AI Summit 2024 - Data Science & Machine Learning

Forum Posts

dsiu
by New Contributor II
  • 843 Views
  • 1 reply
  • 2 kudos

CountVectorizer no longer works through Azure ML

Hello. I am trying to use the CountVectorizer module as part of our feature engineering. It works on a Databricks notebook directly, but when I try to run the code through Azure with the databricks connection, it throws an error. This isn't the first...

Latest Reply
Noopur_Nigam
Valued Contributor II
  • 2 kudos

Hi @Danny Siu, please check that you are using the latest databricks-connect version corresponding to the DBR version that you are using on the Databricks cluster. You can check the available databricks-connect versions here: https://pypi.org/project/databricks-connect/#history

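For anyone hitting the same mismatch, a minimal sketch of that version check in the local client environment (the 10.4 LTS version below is illustrative, not from the thread):

```python
# Check the locally installed databricks-connect version; its major.minor should
# match the Databricks Runtime version of the target cluster.
from importlib.metadata import version  # Python 3.8+

print("databricks-connect:", version("databricks-connect"))

# If it does not match, the usual fix is to reinstall against the cluster's DBR:
#   pip uninstall pyspark databricks-connect
#   pip install -U "databricks-connect==10.4.*"   # substitute your cluster's version
```
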
studentofml
by New Contributor
  • 804 Views
  • 1 reply
  • 0 kudos

Is Model Serving REST API available?

This is mentioned in https://learn.microsoft.com/en-us/azure/databricks/mlflow/create-manage-serverless-model-endpoints with an API call example, while in https://learn.microsoft.com/en-us/answers/questions/892678/how-to-enable-databricks-model-serving-w...

Latest Reply
Kaniz_Fatma
Community Manager
  • 0 kudos

Hi @Thou Mather, did you get a chance to go through this doc?

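For context, a rough sketch of scoring a serving endpoint over REST from Python. The workspace URL, endpoint name, token, and payload shape are all placeholders, and the exact invocation URL and request format depend on your endpoint type and MLflow version, so check the doc linked above:

```python
# Hypothetical example: call a Databricks model serving endpoint over REST.
import requests

WORKSPACE_URL = "https://adb-0000000000000000.0.azuredatabricks.net"  # placeholder
ENDPOINT_NAME = "my-endpoint"                                          # placeholder
TOKEN = "<personal-access-token>"                                      # placeholder

resp = requests.post(
    f"{WORKSPACE_URL}/serving-endpoints/{ENDPOINT_NAME}/invocations",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json={"dataframe_records": [{"feature_1": 1.0, "feature_2": "a"}]},  # payload format varies
    timeout=60,
)
resp.raise_for_status()
print(resp.json())
```
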
ashrafkhan94
by New Contributor II
  • 1244 Views
  • 2 replies
  • 2 kudos

Resolved! Failure in mlflow.spark.load_model: Random Forest pretrained model

model = mlflow.spark.load_model(model_uri=f"models:/{model_name}/{model_version}") Log: An error occurred while calling o2861.load.: org.apache.spark.SparkException: Job aborted due to stage failure: Task 4 in stage 4599.0 failed 4 times, most recent f...

Latest Reply
Noopur_Nigam
Valued Contributor II
  • 2 kudos

Hi @Ashraf Khan, did you get a chance to look into Sean's response? Please let us know if you need more help on this.

1 More Replies
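
For readers landing here, a minimal sketch of the registry-URI load pattern from the excerpt; the model name, version, and input table are placeholders. A stage failure like the one above is raised by failing executor tasks, so the failed task's stack trace in the Spark UI is usually where the real cause shows up:

```python
# Placeholder sketch: load a registered Spark ML model by registry URI and score with it.
import mlflow

model_name = "my_random_forest"   # placeholder
model_version = 1                 # placeholder

model = mlflow.spark.load_model(model_uri=f"models:/{model_name}/{model_version}")

input_df = spark.table("my_features")   # placeholder table; `spark` is the notebook's SparkSession
predictions = model.transform(input_df)
predictions.show(5)
```
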
isaac_gritz
by Valued Contributor II
  • 2156 Views
  • 1 reply
  • 1 kudos

Responsible AI on Databricks

Looking to learn how you can use responsible AI toolkits on Databricks? Interested in learning how you can incorporate open source tools like SHAP and Fairlearn with Databricks? I would recommend checking out this blog: Mitigating Bias in Machine Lear...

Latest Reply
Kaniz_Fatma
Community Manager
  • 1 kudos

Awesome!

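As a flavor of the SHAP side of that blog, a minimal self-contained sketch on a scikit-learn model (assumes the shap and scikit-learn packages are installed on the cluster; the dataset and model are illustrative, not from the blog):

```python
# Illustrative only: global feature-impact view with SHAP on a tree model.
import shap
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor

X, y = load_diabetes(return_X_y=True, as_frame=True)
model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)

explainer = shap.TreeExplainer(model)          # SHAP values for tree ensembles
shap_values = explainer.shap_values(X.iloc[:200])
shap.summary_plot(shap_values, X.iloc[:200])   # which features drive predictions, and how
```
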
DiCamps
by New Contributor II
  • 2435 Views
  • 1 reply
  • 3 kudos

Resolved! Installing pyspark.pandas

Hello guys, I'm trying to migrate a Python project from pandas to the pandas API on Spark, on Azure Databricks using MLflow in a conda env. The thing is, I'm getting the following error: Traceback (most recent call last): File "/databricks/mlflow/projects/x/data_...

Latest Reply
-werners-
Esteemed Contributor III
  • 3 kudos

It should be, yes. Can you elaborate on how you create your notebook (and the conda env you mention)?

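For reference, a minimal sketch of the pandas to pandas-API-on-Spark move (assumes a runtime where pyspark.pandas is available, i.e. Spark 3.2+ / a recent DBR; the column names are illustrative):

```python
import pandas as pd
import pyspark.pandas as ps

pdf = pd.DataFrame({"x": [1, 2, 3], "y": [10.0, 20.0, 30.0]})

psdf = ps.from_pandas(pdf)        # distributed DataFrame with a pandas-like API
psdf["z"] = psdf["x"] * psdf["y"]
print(psdf.describe())

sdf = psdf.to_spark()             # drop down to a plain Spark DataFrame when needed
```
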
Ashley1
by Contributor
  • 5136 Views
  • 5 replies
  • 4 kudos

Unity Catalog - existing dbfs mounts and feature store

Hi All, We're currently considering turning on Unity Catalog, but before we flick the switch I'm hoping I can get a bit more confidence about what will happen with our existing DBFS mounts and feature store. The bit that makes me nervous is the crede...

Latest Reply
karthik_p
Esteemed Contributor
  • 4 kudos

@Ashley Betts Can you please check the article below? As far as I know, we can use external mount points by configuring storage credentials in Unity Catalog. The default method is managed tables, but we can also point to external tables. 1. You can upgrade exi...

4 More Replies
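
A rough sketch of the pattern karthik_p describes, run from a notebook on a Unity Catalog-enabled cluster. The credential, location, catalog, and path names are placeholders, and the storage credential itself is assumed to already exist (created via the UI or API):

```python
# Register existing cloud storage as a UC external location, then an external table on it.
spark.sql("""
  CREATE EXTERNAL LOCATION IF NOT EXISTS my_ext_location
  URL 'abfss://container@storageaccount.dfs.core.windows.net/path'
  WITH (STORAGE CREDENTIAL my_storage_credential)
""")

spark.sql("""
  CREATE TABLE IF NOT EXISTS my_catalog.my_schema.my_external_table
  USING DELTA
  LOCATION 'abfss://container@storageaccount.dfs.core.windows.net/path/table'
""")
```
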
Slalom_Tobias
by New Contributor III
  • 2114 Views
  • 3 replies
  • 0 kudos

"Cannot serialize this model" error when attempting MLflow for Spark NLP

I'm attempting to use MLflow to register models in Databricks and am following the recipe at https://nlp.johnsnowlabs.com/docs/en/licensed_serving_spark_nlp_via_api_databricks_mlflow. When I execute mlflow.spark.log_model(pipeline, "lemmatizer", co...

Latest Reply
Vidula
Honored Contributor
  • 0 kudos

Hi @Tobias Cortese, hope all is well! Just wanted to check in: were you able to resolve your issue, and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from you. Tha...

2 More Replies
KrishZ
by Contributor
  • 6256 Views
  • 4 replies
  • 4 kudos

How to use parallel processing with concurrent jobs in Databricks?

Question: It would be great if you could recommend how I go about solving the problem below. I haven't been able to find much help online. A. Background: A1. I have to do text manipulation using Python (like concatenation, convert to spaCy doc, get verbs...

Latest Reply
Anonymous
Not applicable
  • 4 kudos

Hi @Krishna Zanwar, hope all is well! Just wanted to check in: were you able to resolve your issue, and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from you. Tha...

3 More Replies
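
One common pattern for this kind of question (not taken from the thread): driver-side concurrency with concurrent.futures for independent, lightweight tasks. For large datasets the more Databricks-idiomatic route is a Spark DataFrame plus a pandas UDF, so the work is spread across the cluster rather than kept on one driver node. A minimal sketch of the former:

```python
from concurrent.futures import ThreadPoolExecutor

def process_text(text: str) -> str:
    # placeholder for the real work (concatenation, spaCy parsing, verb extraction, ...)
    return text.upper()

texts = ["first document", "second document", "third document"]

with ThreadPoolExecutor(max_workers=8) as pool:
    results = list(pool.map(process_text, texts))

print(results)
```
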
j_b
by New Contributor
  • 1181 Views
  • 2 replies
  • 0 kudos

API limit on mlflow.tracking.client.MlflowClient.list_run_infos method?

I'm trying out managed MLflow on Databricks Community edition, with tracking data saved on Databricks and artifacts saved on my own AWS S3 bucket. I created one experiment and logged 768 runs in the experiment. When I try to get the list of the runs ...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @jae baak, hope all is well! Just wanted to check in: were you able to resolve your issue, and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from you. Thanks!

1 More Replies
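
If the symptom is a capped result count, a hedged sketch of paging through runs with page tokens (assumes an MLflow 1.x client where list_run_infos accepts max_results and page_token; newer clients expose the same paging on MlflowClient.search_runs). The experiment ID is a placeholder:

```python
from mlflow.tracking import MlflowClient

client = MlflowClient()
experiment_id = "1234"   # placeholder

all_runs, token = [], None
while True:
    page = client.list_run_infos(experiment_id, max_results=500, page_token=token)
    all_runs.extend(page)
    token = page.token      # the paged list carries the token for the next page
    if not token:
        break

print(f"Fetched {len(all_runs)} runs")
```
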
confusedIntern
by New Contributor III
  • 4203 Views
  • 7 replies
  • 2 kudos

MLflow Project run always comes back as status failed.

Hi! This is kind of an urgent question so any help would be greatly appreciated! Thanks so much! So I'm following this tutorial to try to create an MLflow project: https://docs.databricks.com/applications/mlflow/projects.html I tried with the example ...

Latest Reply
sean_owen
Honored Contributor II
  • 2 kudos

This is generally not how you use MLflow in Databricks. You are already in Databricks, so you do not need to send code to Databricks to execute. Instead, just run your code in a notebook; there is no need to package it as an MLflow Project. Projects are prima...

6 More Replies
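
A minimal sketch of what Sean describes, i.e. tracking a run straight from a Databricks notebook with no MLflow Project packaging (the model and metric are illustrative):

```python
import mlflow
import mlflow.sklearn
from sklearn.datasets import load_diabetes
from sklearn.linear_model import Ridge

X, y = load_diabetes(return_X_y=True)

with mlflow.start_run():
    model = Ridge(alpha=0.5).fit(X, y)
    mlflow.log_param("alpha", 0.5)
    mlflow.log_metric("train_r2", model.score(X, y))
    mlflow.sklearn.log_model(model, "model")   # logged under the run's artifacts
```
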
confusedIntern
by New Contributor III
  • 1832 Views
  • 4 replies
  • 0 kudos

What are the parameters for an MLflow Project file?

Hi! I was just wondering what the parameters for an MLflow Project file are. I'm following this tutorial to create my own MLflow Project: https://docs.databricks.com/applications/mlflow/projects.html and within this tutorial, the MLproject file looks lik...

Latest Reply
sean_owen
Honored Contributor II
  • 0 kudos

These are parameters that you specify when you run the MLflow Project with the mlflow CLI. They let you parameterize your code and pass different values to it on each run. How you use them is up to your code. These are not model hyperpar...

3 More Replies
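
To make the flow concrete: the MLproject entry point's command template substitutes each parameter into a command-line invocation, and your script parses them however it likes. A hedged sketch of such an entry-point script (the parameter names are illustrative, not from the tutorial):

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--alpha", type=float, default=0.5)
parser.add_argument("--data_path", type=str, default="/dbfs/tmp/data.csv")
args = parser.parse_args()

print(f"Running with alpha={args.alpha}, data_path={args.data_path}")

# Invoked via the MLflow CLI, for example:
#   mlflow run . -P alpha=0.1 -P data_path=/dbfs/tmp/other.csv
```
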
deep_thought
by New Contributor III
  • 1406 Views
  • 3 replies
  • 0 kudos

Resolved! How to drop single feature from feature store table

I have a feature store table and I would like to change one of the features from IntegerType to FloatType, but I can't merge this change as it violates the schema. Is it possible to drop a single feature from the table and add the revised feature? Current...

Latest Reply
Vidula
Honored Contributor
  • 0 kudos

Hi there @_ _, hope all is well! Just wanted to check in: were you able to resolve your issue, and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from you. Thanks!

2 More Replies
643926
by New Contributor II
  • 800 Views
  • 0 replies
  • 1 kudos

Substantial performance issues/degradation on Databricks when migrating job over to EMR

Versions of code: Databricks 7.3 LTS ML (includes Apache Spark 3.0.1, Scala 2.12); AWS EMR 6.1.0 (Spark 3.0.0, Scala 2.12), https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-610-release.html. The problem: Errors in Databricks when replicating job th...

ssk121995
by New Contributor
  • 595 Views
  • 1 reply
  • 0 kudos

How can I add custom models to Time Series AutoML?

Time Series AutoML currently has very few models for comparison. How can I add some custom models into the mix so that they are compared each time?

Latest Reply
shan_chandra
Esteemed Contributor
  • 0 kudos

Could you please let us know which custom models you are looking to add to Time Series AutoML?

Shuvi
by New Contributor III
  • 1801 Views
  • 3 replies
  • 5 kudos

Resolved! What is the use case of having Azure Synapse (DWH) and Delta Lake (Gold) given we can connect BI to Delta directly?

The curated zone is pushed to a cloud data warehouse such as Synapse Dedicated SQL Pools, which then acts as a serving layer for BI tools and analysts. I believe we can have models in the gold layer and have BI connect to this layer, or we can have serverless ...

Latest Reply
Shuvi
New Contributor III
  • 5 kudos

Thank you. So for a large workload where we need a lot of optimization we might need Synapse, but for a small/medium workload we might stick to Delta tables.

2 More Replies