Machine Learning

by Orianh • Valued Contributor II

10-30-2022 1:21:52 AM

1766 Views
1 reply
2 kudos

Run mlflow project from a Job.

Hey guys, I'm trying to build an automated process to run ML training sessions using MLflow and Databricks jobs. I develop the model on my local machine using an IDE. When finished, I have a template notebook that gets as parameters the mlflow project p...

Latest Reply

Kaniz_Fatma
Community Manager

11-01-2022 3:49:45 PM

2 kudos

Hi @orian hindi, we haven’t heard from you since the last response, and I was checking back to see if you have a resolution yet. If you have any solution, please share it with the community as it can be helpful to others. Otherwise, we will respon...

by cheng • New Contributor

10-30-2022 2:57:20 PM

734 Views
0 replies
0 kudos

Prophet/PyStan compiling error in Runtime 10.4 LTS ML

We're upgrading our ML jobs from Runtime 9.1 LTS ML to Runtime 10.4 LTS ML in Databricks. One of the libraries our jobs rely on is Prophet. From 9.1 to 10.4, the versions of both Prophet (1.0.1) and PyStan (2.19.1.1) haven't changed, however...

by archanarddy • New Contributor

10-27-2022 7:39:30 AM

541 Views
0 replies
0 kudos

How to check unlinked databricks configs which are not used in any shards

We have a limit on the number of Databricks shards we can deploy, and a few shards are unused. How can we check for and remove these unlinked Databricks shards using API calls?

by DavideCagnoni • Contributor

09-27-2022 2:56:52 AM

8060 Views
8 replies
7 kudos

Resolved! How to use python packages from `sys.path` ( in some sort of "edit-mode") which functions on workers too?

The help of `dbx sync` states that "for the imports to work you need to update the Python path to include this target directory you're syncing to". This works quite well whenever the package contains only driver-level functions. However, I ran...
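The `dbx sync` hint quoted above amounts to putting the synced directory on the Python path. A minimal driver-side sketch, assuming a hypothetical target directory (`/tmp/synced_repo` is a placeholder, not a path from the post, and `add_to_python_path` is an illustrative helper), might look like this:

```python
import sys
from pathlib import Path


def add_to_python_path(directory: str) -> bool:
    """Put `directory` at the front of sys.path unless it is already there.

    Returns True if the path was added, False if it was already present.
    """
    resolved = str(Path(directory).resolve())
    if resolved in sys.path:
        return False
    sys.path.insert(0, resolved)
    return True


# Hypothetical sync target; replace with the directory dbx sync writes to.
SYNC_TARGET = "/tmp/synced_repo"
add_to_python_path(SYNC_TARGET)
```

Changing `sys.path` this way only affects the Python process that runs the code, which fits the distinction the question draws between driver-level functions and code that must also run on workers.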

Latest Reply

Scott_B
New Contributor III

10-25-2022 9:13:13 AM

7 kudos

Hi @Davide Cagnoni. Please see my answer to this post: https://community.databricks.com/s/question/0D53f00001mUyh2CAC/limitations-with-udfs-wrapping-modules-imported-via-repos-files I will copy it here for you: If your notebook is in the same Repo as t...

7 More Replies

by zzy • New Contributor III

10-14-2022 10:07:02 AM

1574 Views
2 replies
2 kudos

Why is GPU accelerated node much slower than CPU node for training a random forest model on databricks?

I have a dataset of about 5 million rows with 14 features and a binary target. I decided to train a PySpark random forest classifier on Databricks. The CPU cluster I created contains 2 c4.8xlarge workers (60 GB, 36 cores) and 1 r4.xlarge (31 GB, 4 cores) driv...

Latest Reply

Hubert-Dudek
Esteemed Contributor III

10-20-2022 5:40:36 AM

2 kudos

In many cases, you need to adjust your code to utilize GPU.

1 More Replies

by elementalM • New Contributor III

09-30-2022 10:41:04 AM

2333 Views
4 replies
4 kudos

Catch-up Structured Stream hangs on last step of write job to delta sync using toTable

I'm running Databricks version 10.4 on GCP. I'm running a structured stream trying to process historical files in a Delta table on GCP cloud storage. This source Delta table is big but maintained with OPTIMIZE. The stream repartitions which seems to b...

Latest Reply

Anonymous
Not applicable

10-19-2022 3:21:42 AM

4 kudos

Hi @Dwight Branscombe, hope all is well! Just wanted to check whether you were able to resolve your issue. If so, would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help. We'd love to hear from you....

3 More Replies

by HemanthVak • New Contributor II

09-22-2022 9:44:30 AM

1271 Views
2 replies
3 kudos

How to isolate environments for different projects in a single mlflow server?

I am planning to deploy an MLflow server in Azure as a centralised repository for my machine learning experiments and runs, and to store events and artifacts. I would like to have different environments or isolated environments in the same wor...

Latest Reply

Anonymous
Not applicable

10-13-2022 2:47:18 AM

3 kudos

Hi @Hemanth Vakacharla, does @Debayan Mukherjee's response answer your question? If yes, would you be happy to mark it as best so that other members can find the solution more quickly? We'd love to hear from you. Thanks!

1 More Replies

by weldermartins • Honored Contributor

10-08-2022 6:20:27 AM

17036 Views
17 replies
13 kudos

Resolved! Created nested struct schema SPARK - Schema Jira

Hello guys, I'm using the Jira API to return "ISSUES". To be able to use PySpark I need to create the DataFrame by passing in the schema, but I am not able to create the schema based on the model below. Would you have any ideas? root |-- expand: string ...

Latest Reply

-werners-
Esteemed Contributor III

10-11-2022 1:21:32 AM

13 kudos

If columns are missing, that particular data is not present in the JSON. I am not aware of Spark skipping columns when reading JSON with inferSchema. There is an option dropFieldIfAllNull, but that is False by default. That makes me think: you might ...

16 More Replies

by data_engineer_s • New Contributor II

09-28-2022 6:37:23 PM

1144 Views
2 replies
0 kudos

Utilize databricks compute for model training from Pycharm IDE

I would like to train my machine learning model from the PyCharm IDE, but I want to use a Databricks cluster as compute power to speed up the training. Is it possible?

Latest Reply

Anonymous
Not applicable

10-09-2022 12:03:03 AM

0 kudos

Hi @Suvikram Yerramilli, hope all is well! Just wanted to check whether you were able to resolve your issue. If so, would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help. We'd love to hear from yo...

1 More Replies

by Willis1 • New Contributor

09-30-2022 12:44:17 AM

864 Views
2 replies
1 kudos

Feature Store best practice: refactoring notebook

Hello, I have a question about best practice regarding registering a feature in Databricks Feature Store. Let's say that I create and register features during the EDA or experiment phase of an ML project. Later the model is moving to production stage ...

Latest Reply

Anonymous
Not applicable

10-08-2022 11:06:39 PM

1 kudos

Hi @Willis Harding, does @Kaniz Fatma's response answer your question? If yes, would you be happy to mark it as best so that other members can find the solution more quickly? We'd love to hear from you. Thanks!

1 More Replies

by jochoa • New Contributor

10-03-2022 2:01:41 PM

1407 Views
1 reply
0 kudos

Resolved! Issue logging into my account

Hello, I need assistance accessing my account in Databricks Community Edition. I got an error that my account was locked due to recent suspicious activity. I tried to reset my password but did not get an email with password change instructions. Than...

Latest Reply

Kaniz_Fatma
Community Manager

10-07-2022 12:39:04 PM

0 kudos

Hi @Juan Ochoa, thank you for reaching out, and we’re sorry to hear about this log-in issue! We have this Community Edition login troubleshooting post on Community. Please take a look, and follow the troubleshooting steps. If the steps do not resol...

by jcapplefields88 • New Contributor II

06-10-2022 7:37:11 AM

1412 Views
3 replies
1 kudos

Expose low latency APIs from Deltalake for mobile apps and microservices

My company is using Delta Lake to extract customer insights and run batch scoring with ML models. I need to expose this data to some microservices through gRPC and REST APIs. How can I do this? I'm thinking of building Spark pipelines to extract the data, stor...

Latest Reply

Noopur_Nigam
Valued Contributor II

10-02-2022 11:53:30 PM

1 kudos

Hi @John Capplefield, a gentle follow-up: please let us know if you need further help on this.

2 More Replies

by dsiu • New Contributor II

08-01-2022 6:42:04 AM

1024 Views
1 reply
2 kudos

CountVectorizer no longer works through Azure ML

Hello. I am trying to use the CountVectorizer module as part of our feature engineering. It works on a Databricks notebook directly, but when I try to run the code through Azure with the databricks connection, it throws an error. This isn't the first...

Latest Reply

Noopur_Nigam
Valued Contributor II

10-02-2022 11:49:53 PM

2 kudos

Hi @Danny Siu, please check that you are using the latest databricks-connect version corresponding to the DBR version of your Databricks cluster. You can check the latest versions here: https://pypi.org/project/databricks-connect/#history
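The advice above is about keeping the `databricks-connect` client in step with the cluster's Databricks Runtime. As a rough sketch, assuming the rule of thumb that the client's major and minor version should equal the cluster's (the version strings below are made-up examples, and `versions_compatible` is a hypothetical helper, not part of any Databricks library), the comparison could be written as:

```python
from typing import Tuple


def major_minor(version: str) -> Tuple[int, int]:
    """Extract (major, minor) from a version string such as '10.4.12'."""
    parts = version.strip().split(".")
    return int(parts[0]), int(parts[1])


def versions_compatible(client_version: str, runtime_version: str) -> bool:
    """Assumed rule: client and runtime must share major and minor numbers."""
    return major_minor(client_version) == major_minor(runtime_version)


# Made-up example values for illustration only.
print(versions_compatible("10.4.12", "10.4.x-scala2.12"))  # True
print(versions_compatible("9.1.20", "10.4.x-scala2.12"))   # False
```

On a machine where the client is installed, the first argument could be read with `importlib.metadata.version("databricks-connect")` rather than hard-coded.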

by studentofml • New Contributor

09-29-2022 3:05:22 AM

1015 Views
1 reply
0 kudos

Is Model Serving REST API available?

This is mentioned in: https://learn.microsoft.com/en-us/azure/databricks/mlflow/create-manage-serverless-model-endpoints with an API call example, while in: https://learn.microsoft.com/en-us/answers/questions/892678/how-to-enable-databricks-model-serving-w...

Latest Reply

Kaniz_Fatma
Community Manager

10-02-2022 5:12:45 PM

0 kudos

Hi @Thou Mather, did you get a chance to go through this doc?

by ashrafkhan94 • New Contributor II

08-26-2022 12:17:29 AM

1563 Views
2 replies
2 kudos

Resolved! Failure in mlflow.spark.load_model : Random Forest pretrained model

model = mlflow.spark.load_model(model_uri=f"models:/{model_name}/{model_version}")
Log: An error occurred while calling o2861.load.: org.apache.spark.SparkException: Job aborted due to stage failure: Task 4 in stage 4599.0 failed 4 times, most recent f...

Latest Reply

Noopur_Nigam
Valued Contributor II

09-30-2022 4:27:25 AM

2 kudos

Hi @Ashraf Khan, did you get a chance to look into Sean's response? Please let us know if you need more help on this.

1 More Replies
