Machine Learning

Forum Posts

isaac_gritz
by Valued Contributor II
  • 1604 Views
  • 1 reply
  • 1 kudos

Responsible AI on Databricks

Looking to learn how you can use responsible AI toolkits on Databricks? Interested in learning how you can incorporate open source tools like SHAP and Fairlearn with Databricks? I would recommend checking out this blog: Mitigating Bias in Machine Lear...
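
For readers who want a concrete starting point, here is a minimal SHAP sketch (not from the blog post), assuming a fitted scikit-learn tree model; the dataset and model below are illustrative only.

```python
# Minimal SHAP sketch (illustrative): explain a fitted tree model's predictions
# and plot a global feature-importance overview.
import shap
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

X, y = load_breast_cancer(return_X_y=True, as_frame=True)
model = RandomForestClassifier(n_estimators=50).fit(X, y)

explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)   # per-feature contributions per row
shap.summary_plot(shap_values, X)        # global importance summary
```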

Latest Reply
Kaniz
Community Manager
  • 1 kudos

Awesome!

DiCamps
by New Contributor II
  • 1566 Views
  • 1 reply
  • 3 kudos

Resolved! Installing pyspark.pandas

Hello guys, I'm trying to migrate a Python project from pandas to the pandas API on Spark, on Azure Databricks, using MLflow in a conda env. The thing is, I'm getting the following error: Traceback (most recent call last): File "/databricks/mlflow/projects/x/data_...

Latest Reply
-werners-
Esteemed Contributor III
  • 3 kudos

It should be, yes. Can you elaborate on how you create your notebook (and the conda env you mention)?
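
For reference (not part of the original thread), the import swap involved in the migration is small, assuming a runtime where pyspark 3.2+ is available (pyspark.pandas ships with Spark 3.2); the file path and column names below are placeholders.

```python
# pandas -> pandas API on Spark: the import is the main change for simple code.
# Requires Spark 3.2+ (e.g. a recent Databricks Runtime); path/columns are placeholders.
import pyspark.pandas as ps

psdf = ps.read_csv("dbfs:/tmp/data.csv")        # pandas-on-Spark DataFrame
psdf["total"] = psdf["qty"] * psdf["price"]     # familiar pandas-style syntax
pdf = psdf.to_pandas()                          # back to plain pandas if needed
```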

Ashley1
by Contributor
  • 3971 Views
  • 5 replies
  • 4 kudos

Unity Catalog - existing dbfs mounts and feature store

Hi All, we're currently considering turning on Unity Catalog, but before we flick the switch I'm hoping to get a bit more confidence about what will happen with our existing DBFS mounts and feature store. The bit that makes me nervous is the crede...

Latest Reply
karthik_p
Esteemed Contributor
  • 4 kudos

@Ashley Betts, can you please check the article below? As far as I know, we can keep using external mount points by configuring storage credentials in Unity Catalog. The default method is managed tables, but we can point to external tables as well. 1. You can upgrade exi...
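
Not from the original reply, but a rough sketch of the external-location route being described, run from a notebook; the credential, URL, and table names are placeholders.

```python
# Rough Unity Catalog sketch (placeholders throughout): register the same cloud
# storage a DBFS mount pointed at, then create an external table on top of it.
spark.sql("""
    CREATE EXTERNAL LOCATION IF NOT EXISTS raw_events_loc
    URL 'abfss://raw@mystorageaccount.dfs.core.windows.net/events'
    WITH (STORAGE CREDENTIAL my_storage_credential)
""")

spark.sql("""
    CREATE TABLE IF NOT EXISTS main.analytics.events
    USING DELTA
    LOCATION 'abfss://raw@mystorageaccount.dfs.core.windows.net/events'
""")
```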

4 More Replies
Slalom_Tobias
by New Contributor III
  • 1538 Views
  • 3 replies
  • 0 kudos

Cannot serialize this model error when attempting MLFlow for SparkNLP

I'm attempting to use MLflow to register models in Databricks and am following the recipe at https://nlp.johnsnowlabs.com/docs/en/licensed_serving_spark_nlp_via_api_databricks_mlflow. When I execute mlflow.spark.log_model(pipeline, "lemmatizer", co...
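
For context (not from the original thread), the general mlflow.spark.log_model pattern expects a fitted Spark ML PipelineModel; the stages below are plain Spark ML placeholders rather than Spark NLP components.

```python
# General pattern (placeholders, not Spark NLP): log a *fitted* PipelineModel.
import mlflow
from pyspark.ml import Pipeline
from pyspark.ml.feature import HashingTF, Tokenizer

df = spark.createDataFrame([("some example text",)], ["text"])
pipeline = Pipeline(stages=[
    Tokenizer(inputCol="text", outputCol="words"),
    HashingTF(inputCol="words", outputCol="features"),
])
model = pipeline.fit(df)  # the fitted PipelineModel, not the raw Pipeline

with mlflow.start_run():
    mlflow.spark.log_model(model, "lemmatizer")
```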

Latest Reply
Vidula
Honored Contributor
  • 0 kudos

Hi @Tobias Cortese, hope all is well! Just wanted to check in: were you able to resolve your issue, and would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help. We'd love to hear from you. Tha...

2 More Replies
KrishZ
by Contributor
  • 5069 Views
  • 4 replies
  • 4 kudos

How to use parallel processing with concurrent jobs in Databricks?

Question: It would be great if you could recommend how I should go about solving the problem below. I haven't been able to find much help online. A. Background: A1. I have to do text manipulation using Python (like concatenation, convert to a spaCy doc, get verbs...
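
Not part of the original question, but a sketch of one common alternative to concurrent jobs: let Spark distribute the per-row Python work with a pandas UDF. process_text below is a stand-in for the poster's spaCy logic, and the DataFrame is illustrative.

```python
# Sketch: distribute row-wise text processing with a pandas UDF instead of
# running concurrent jobs. process_text is a placeholder for the real logic.
import pandas as pd
from pyspark.sql.functions import pandas_udf
from pyspark.sql.types import StringType

def process_text(text: str) -> str:
    return text.strip().lower()  # stand-in for concatenation / spaCy parsing / verb extraction

@pandas_udf(StringType())
def process_text_udf(texts: pd.Series) -> pd.Series:
    return texts.apply(process_text)

df = spark.createDataFrame([("Some Example Text",)], ["text"])
result = df.withColumn("processed", process_text_udf("text"))
result.show()
```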

Latest Reply
Anonymous
Not applicable
  • 4 kudos

Hi @Krishna Zanwar, hope all is well! Just wanted to check in: were you able to resolve your issue, and would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help. We'd love to hear from you. Tha...

3 More Replies
j_b
by New Contributor
  • 893 Views
  • 2 replies
  • 0 kudos

API limit on mlflow.tracking.client.MlflowClient.list_run_infos method?

I'm trying out managed MLflow on Databricks Community edition, with tracking data saved on Databricks and artifacts saved on my own AWS S3 bucket. I created one experiment and logged 768 runs in the experiment. When I try to get the list of the runs ...
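
Not from the original post: a paging sketch for when a single list_run_infos call is capped by the backend. The experiment ID is a placeholder, and newer MLflow releases favour client.search_runs, which pages the same way.

```python
# Page through all runs instead of relying on one (possibly capped) call.
# PagedList exposes the next-page token via .token.
from mlflow.tracking import MlflowClient

client = MlflowClient()
experiment_id = "1234567890"  # placeholder

runs, page_token = [], None
while True:
    page = client.list_run_infos(experiment_id, max_results=200, page_token=page_token)
    runs.extend(page)
    page_token = page.token
    if not page_token:
        break

print(f"Fetched {len(runs)} runs")
```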

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @jae baak, hope all is well! Just wanted to check in: were you able to resolve your issue, and would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help. We'd love to hear from you. Thanks!

1 More Replies
confusedIntern
by New Contributor III
  • 3048 Views
  • 7 replies
  • 2 kudos

MLflow Project run always comes back as status failed.

Hi! This is kind of an urgent question, so any help would be greatly appreciated. Thanks so much! So I'm following this tutorial to try to create an MLflow project: https://docs.databricks.com/applications/mlflow/projects.html. I tried with the example ...

Attachments: Screen Shot 2022-07-13 at 10.09.07 AM, Screen Shot 2022-07-13 at 10.13.48 AM, Screen Shot 2022-07-13 at 10.17.26 AM
Latest Reply
sean_owen
Honored Contributor II
  • 2 kudos

This is generally not how you use MLflow in Databricks. You are already in Databricks so do not need to send code to Databricks to execute. Instead just run your code in a notebook; there is no need to package as an MLflow Project. Projects are prima...
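
Not from the original reply: a minimal sketch of the notebook-first workflow being described, logging params, metrics, and a model directly from the notebook; the scikit-learn model and dataset are illustrative.

```python
# Log directly from a notebook run; no MLflow Project packaging needed.
import mlflow
import mlflow.sklearn
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)

with mlflow.start_run():
    model = LogisticRegression(max_iter=200).fit(X, y)
    mlflow.log_param("max_iter", 200)
    mlflow.log_metric("train_accuracy", model.score(X, y))
    mlflow.sklearn.log_model(model, "model")
```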

6 More Replies
confusedIntern
by New Contributor III
  • 1313 Views
  • 4 replies
  • 0 kudos

What are the parameters for an MLflow Project file?

Hi! I was just wondering, what are the parameters for an MLflow Project file? I'm following this tutorial to create my own MLflow Project: https://docs.databricks.com/applications/mlflow/projects.html, and within this tutorial, the MLproject file looks lik...

Latest Reply
sean_owen
Honored Contributor II
  • 0 kudos

These are the parameters you specify when you run the MLflow Project with the mlflow CLI. They let you parameterize your code and pass different values to it. How you use them is up to your code. These are not model hyperpar...
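
Not from the original reply: a sketch of how those parameters get supplied at run time. The project URI and the parameter names (alpha, data_path) are illustrative only.

```python
# Supplying MLproject parameters at run time (parameter names are placeholders).
# CLI equivalent:  mlflow run . -P alpha=0.5 -P data_path=/dbfs/tmp/train.csv
import mlflow

mlflow.projects.run(
    uri=".",
    parameters={"alpha": 0.5, "data_path": "/dbfs/tmp/train.csv"},
)
```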

3 More Replies
deep_thought
by New Contributor III
  • 894 Views
  • 3 replies
  • 0 kudos

Resolved! How to drop a single feature from a feature store table

I have a feature store table and I would like to change one of the features from IntegerType to FloatType, but I can't merge this change as it violates the schema. Is it possible to drop a single feature from the table and add the revised feature? Current...
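
Not from the original thread, and heavier-handed than a single-column drop: one possible workaround is to read the existing features, cast the column, then drop and recreate the table. This assumes a Feature Store client version that exposes drop_table; all names below are placeholders.

```python
# Workaround sketch (placeholders throughout): cast the column, then rebuild the table.
# drop_table is destructive and requires a recent databricks-feature-store client.
from databricks.feature_store import FeatureStoreClient
from pyspark.sql.functions import col

fs = FeatureStoreClient()

df = fs.read_table("my_db.my_features")
df = df.withColumn("my_feature", col("my_feature").cast("float"))

fs.drop_table("my_db.my_features")
fs.create_table(name="my_db.my_features", primary_keys=["id"], df=df)
```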

Latest Reply
Vidula
Honored Contributor
  • 0 kudos

Hi there @_ _, hope all is well! Just wanted to check in: were you able to resolve your issue, and would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help. We'd love to hear from you. Thanks!

2 More Replies
643926
by New Contributor II
  • 567 Views
  • 0 replies
  • 1 kudos

Substantial performance issues/degradation on Databricks when migrating job over to EMR

Versions of code: Databricks 7.3 LTS ML (includes Apache Spark 3.0.1, Scala 2.12); AWS EMR 6.1.0 (Spark 3.0.0, Scala 2.12), https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-610-release.html. The problem: Errors in Databricks when replicating job th...

ssk121995
by New Contributor
  • 425 Views
  • 1 reply
  • 0 kudos

How can I add custom models to Time Series AutoML?

Time Series AutoML currently has very few models for comparison. How can I add some custom models into the mix so that they are compared each time?

Latest Reply
shan_chandra
Honored Contributor III
  • 0 kudos

Could you please let us know which custom models you are looking to add to time series AutoML?

Shuvi
by New Contributor III
  • 1239 Views
  • 3 replies
  • 5 kudos

Resolved! What is the use case for having Azure Synapse (DWH) and Delta Lake (Gold), given we can connect BI to Delta directly?

The curated zone is pushed to a cloud data warehouse such as Synapse Dedicated SQL Pools, which then acts as a serving layer for BI tools and analysts. I believe we can have models in the gold layer and have BI connect to this layer, or we can have serverless ...

Latest Reply
Shuvi
New Contributor III
  • 5 kudos

Thank you. So for a large workload, where we need a lot of optimization, we might need Synapse, but for a small/medium workload we might stick with Delta tables.

2 More Replies
NSRBX
by Contributor
  • 1474 Views
  • 3 replies
  • 2 kudos

Resolved! How to extract the names of the primary key columns of a feature store table in a Databricks environment?

Hello, please suggest how to obtain the names of the primary key columns in my feature store table in the Hive metastore. 'describe' gives me the names of the columns but not the indexes. Thanks in advance for your help. Regards,

Latest Reply
NSRBX
Contributor
  • 2 kudos

Hi Vidula, yes, I solved it! I used the get_table function of the FeatureStoreClient class. It gives you all the data you need: the primary keys, timestamp keys, and features of the feature store table. Regards
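
A short sketch of the approach described above (the Python method is get_table); the table name is a placeholder.

```python
# Inspect a feature table's metadata via the Feature Store client.
from databricks.feature_store import FeatureStoreClient

fs = FeatureStoreClient()
ft = fs.get_table("my_db.my_features")  # placeholder table name

print(ft.primary_keys)     # primary key column names
print(ft.timestamp_keys)   # timestamp key columns, if any
print(ft.features)         # feature column names
```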

2 More Replies
565050
by New Contributor
  • 2129 Views
  • 2 replies
  • 0 kudos

Can't overwrite to S3 object

We are trying to write the data frame to s3 using: df.write.mode('overwrite').save("s3://BUCKET-NAME/temp"), but recently we are getting the following error: 'com.amazonaws.services.s3.model.MultiObjectDeleteException: One or more objects could not ...

Latest Reply
Vidula
Honored Contributor
  • 0 kudos

Hi @Mayank Kasturia, hope all is well! Just wanted to check in: were you able to resolve your issue, and would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help. We'd love to hear from you. Th...

1 More Replies
Kash
by Contributor III
  • 752 Views
  • 1 reply
  • 1 kudos

Building a Data Quality pipeline with alerting

Hi there, my question is: how do we set up a data-quality pipeline with alerting? Background: We would like to set up a data-quality pipeline to ensure the data we collect each day is consistent and complete. We will use key metrics found in our bronze JS...

Latest Reply
User16753725469
Contributor II
  • 1 kudos

Hi @Avkash Kana, I would suggest using Delta Live Tables (DLT); it has the features you are looking for: https://docs.databricks.com/workflows/delta-live-tables/index.html
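
Not from the original reply: a minimal sketch of the DLT expectations being recommended, to run inside a DLT pipeline notebook. Table, column names, and the source path are placeholders; alerting on violations would be layered on top (for example via the DLT event log).

```python
# DLT sketch (placeholders throughout): declare data-quality expectations on a table.
import dlt

@dlt.table(comment="Bronze events with basic quality checks")
@dlt.expect_or_drop("non_null_id", "event_id IS NOT NULL")
@dlt.expect("recent_timestamp", "event_ts >= '2022-01-01'")
def bronze_events():
    return spark.read.json("/mnt/raw/events/")  # placeholder source path
```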
