Machine Learning

by AndersenHuang • Visitor

2 hours ago

6 Views
0 replies
0 kudos

Spacy Retraining failure

Hello, I'm having problems trying to run my retraining notebook for a spaCy model. The notebook creates a shell file with the following lines of code: cmd = f''' awk '{{sub("source = ","source = /dbfs/FileStore/{dbfs_folder}/textcat/categories...

Machine Learning

by moh3th1 • Visitor

yesterday

36 Views
1 reply
0 kudos

Optimal Cluster Configuration for Training on Billion-Row Datasets

Hello Databricks Community, I am currently facing a challenge in configuring a cluster for training machine learning models on a dataset consisting of approximately a billion rows and 40 features. Given the volume of data, I want to ensure that the cl...

Machine Learning

Latest Reply

Kaniz
Community Manager

11 hours ago

0 kudos

Hi @moh3th1, Machine Selection: Memory (RAM): Having sufficient memory is essential for large datasets. Ensure that your machine type has enough RAM to accommodate your data. CPU: CPU power impacts data processing speed. Consider CPUs with multiple...
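
To make the memory point concrete, here is a rough back-of-the-envelope sketch for the dataset described in the question (about a billion rows and 40 features). The function name `estimate_dense_size_gib` and the assumption of dense float64 or float32 storage are illustrative only; actual memory use in Spark or pandas also depends on serialization, caching, and the training algorithm's own overhead.

```python
def estimate_dense_size_gib(n_rows: int, n_features: int, bytes_per_value: int) -> float:
    """Return the raw size in GiB of a dense n_rows x n_features numeric table."""
    total_bytes = n_rows * n_features * bytes_per_value
    return total_bytes / (1024 ** 3)


if __name__ == "__main__":
    # Figures taken from the question: ~1 billion rows, 40 features.
    rows, features = 1_000_000_000, 40
    for label, width in (("float64", 8), ("float32", 4)):
        size = estimate_dense_size_gib(rows, features, width)
        print(f"{label}: {size:.1f} GiB of raw values")
```

Under these assumptions the raw values alone come to a few hundred GiB, so the data would need to be spread across several workers or sampled rather than held on a single typical node.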

by Anonymous • Not applicable

03-01-2022 10:01:00 AM

126398 Views
60 replies
3 kudos

Community Edition Login Issues Below is a list of troubleshooting steps for failing to login with email/password at community.cloud.databricks.com: ...

Community Edition Login Issues Below is a list of troubleshooting steps for failing to login with email/password at community.cloud.databricks.com: Troubleshooting Tips If this is your first time logging in, ensure that you did indeed sign u...

Machine Learning

Latest Reply

akuma67
New Contributor II

12 hours ago

3 kudos

Hey, I have been logged out, and even the password reset email is not arriving. How long does it take to resolve? My account is ak.email86@gmail.com

59 More Replies

by Shreyash • New Contributor

Tuesday

171 Views
4 replies
0 kudos

java.lang.ClassNotFoundException: com.johnsnowlabs.nlp.DocumentAssembler

I am trying to serve a pyspark model using an endpoint. I was able to load and register the model normally. I could also load that model and perform inference but while serving the model, I am getting the following error: [94fffqts54] ERROR StatusLog...

Machine Learning

Model serving

sparknlp

Latest Reply

Kaniz
Community Manager

Wednesday

0 kudos

Hi @Shreyash, It looks like your code is encountering a java.lang.ClassNotFoundException for the com.johnsnowlabs.nlp.DocumentAssembler class while serving your PySpark model. This error occurs when the required class is not found in the classpath. ...

3 More Replies

by amal15 • New Contributor II

Monday

91 Views
1 reply
0 kudos

XGBoostEstimator is not a member of package ml.dmlc.xgboost4j.scala.spark ?

XGBoostEstimator is not a member of package ml.dmlc.xgboost4j.scala.spark? How can I resolve this error?

Machine Learning

Latest Reply

Kaniz
Community Manager

Wednesday

0 kudos

Hi @amal15, The error message you’re encountering, “XGBoostEstimator is not a member of package ml.dmlc.xgboost4j.scala.spark,” indicates that the XGBoostEstimator class is not being recognized within the specified package. Check Dependencie...

by Colombia • New Contributor

Monday

208 Views
1 reply
0 kudos

Use of API from package enerbitdso 0.1.8 PyPI

Hello! I have code that uses an API supplied in the enerbitdso package (this is the repository: https://pypi.org/project/enerbitdso/). I changed the code, adapting it to Azure Databricks in Python, but although there is a connection with the API, it does ...

Machine Learning

Latest Reply

Kaniz
Community Manager

Wednesday

0 kudos

Hi @Colombia, To execute a notebook in Azure Databricks programmatically and retrieve its results, you can use the Jobs REST API. Here’s how it works: Create a new job (using the notebook_task parameter) or create a single run (also called RunSubmit...
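
As a minimal sketch of the single-run approach mentioned above, the snippet below builds (but does not send) a request to the Jobs API `runs/submit` endpoint with a `notebook_task`. The helper name `build_run_submit_request`, the run name, the task key, and the use of an existing cluster ID are assumptions for illustration; the workspace URL, token, and notebook path must be supplied by the caller.

```python
import json
import urllib.request


def build_run_submit_request(host, token, notebook_path, cluster_id, params=None):
    """Build a POST request for a one-time notebook run (no job is created)."""
    payload = {
        "run_name": "notebook-run-via-api",  # hypothetical name
        "tasks": [
            {
                "task_key": "run_notebook",  # hypothetical key
                "existing_cluster_id": cluster_id,
                "notebook_task": {
                    "notebook_path": notebook_path,
                    "base_parameters": params or {},
                },
            }
        ],
    }
    return urllib.request.Request(
        url=f"{host.rstrip('/')}/api/2.1/jobs/runs/submit",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Sending the request with `urllib.request.urlopen(req)` should return a JSON body containing a `run_id`. Once the run finishes, the value the notebook passes to `dbutils.notebook.exit(...)` can typically be read from `GET /api/2.1/jobs/runs/get-output`, using the run ID of the individual task rather than the parent run; check the Jobs API reference for your workspace before relying on this.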

by e6exghu8 • New Contributor

Monday

161 Views
1 reply
0 kudos

Help - org.apache.spark.SparkException: Job aborted due to stage failure: Task 47 in stage 2842.0

Hello, I am training a SparkXGBRegressor model. It runs without errors if the complexity is low, however when I increase the max_depth and/or num_parallel_tree parameters, I get an error. I checked the cluster metrics during training and it doesn't l...

Machine Learning

Latest Reply

Kaniz
Community Manager

Wednesday

0 kudos

Hi @e6exghu8, Ensure that your cluster has sufficient memory to handle the increased complexity (higher max_depth and num_parallel_tree). Check the memory configuration for your Spark executors. You might need to allocate more memory to each executor...
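
To see why raising max_depth and num_parallel_tree increases the load so quickly, the sketch below computes an upper bound on the number of tree nodes in the model. It relies only on the fact that a binary tree of depth d has at most 2^(d+1) - 1 nodes; the function names and the parameter values in the example are assumptions, and this bounds model size only, not the peak memory used during training.

```python
def max_nodes_per_tree(max_depth: int) -> int:
    """Upper bound on nodes in one binary tree of the given depth."""
    return 2 ** (max_depth + 1) - 1


def max_total_nodes(max_depth: int, num_parallel_tree: int, n_estimators: int) -> int:
    """Upper bound across all trees: each boosting round builds num_parallel_tree trees."""
    return max_nodes_per_tree(max_depth) * num_parallel_tree * n_estimators


if __name__ == "__main__":
    # Hypothetical settings, chosen only to show the growth.
    print(max_total_nodes(max_depth=6, num_parallel_tree=1, n_estimators=100))
    print(max_total_nodes(max_depth=12, num_parallel_tree=10, n_estimators=100))
```

The bound doubles with each extra level of depth and scales linearly with num_parallel_tree, which is consistent with a job that succeeds at low complexity and fails as both parameters grow.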

by cmilligan • Contributor II

11-23-2022 12:43:30 PM

3075 Views
3 replies
2 kudos

Issue with Multi-column In predicates are not supported in the DELETE condition.

I'm trying to delete rows from a table with the same date or id as records in another table. I'm using the below query and get the error 'Multi-column In predicates are not supported in the DELETE condition'. delete from cost_model.cm_dispatch_consol...

Machine Learning

Latest Reply

shubhaskar
New Contributor

Wednesday

2 kudos

Had the same issue. Please check the value returned by the subquery; there must be something wrong with it.
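
For readers hitting the same message, the sketch below shows two ways to express the delete without a multi-column IN predicate. It uses Python's built-in sqlite3 purely as a stand-in to demonstrate the SQL shape; the tables `target` and `source` and their columns are hypothetical, and whether a given Delta or Databricks runtime accepts these exact forms should be checked against its documentation.

```python
import sqlite3


def fresh_conn() -> sqlite3.Connection:
    """Create an in-memory database with small hypothetical tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE target (id INTEGER, dt TEXT);
        CREATE TABLE source (id INTEGER, dt TEXT);
        INSERT INTO target VALUES
            (1, '2022-11-01'), (2, '2022-11-02'), (3, '2022-11-03'), (4, '2022-11-04');
        INSERT INTO source VALUES
            (1, '2022-11-09'), (9, '2022-11-02'), (3, '2022-11-03');
    """)
    return conn


# Variant A: same id OR same date, using two single-column IN predicates.
conn_or = fresh_conn()
conn_or.execute("""
    DELETE FROM target
    WHERE id IN (SELECT id FROM source)
       OR dt IN (SELECT dt FROM source)
""")
remaining_or = conn_or.execute("SELECT id, dt FROM target ORDER BY id").fetchall()

# Variant B: same id AND same date (what a multi-column IN means), using EXISTS.
conn_pair = fresh_conn()
conn_pair.execute("""
    DELETE FROM target
    WHERE EXISTS (
        SELECT 1 FROM source s
        WHERE s.id = target.id AND s.dt = target.dt
    )
""")
remaining_pair = conn_pair.execute("SELECT id, dt FROM target ORDER BY id").fetchall()

print(remaining_or)    # rows left after the OR variant
print(remaining_pair)  # rows left after the pair-match variant
```

The two variants are not equivalent: the question asks for rows with the same date or id, which corresponds to variant A, whereas a multi-column IN matches only rows where both columns agree, as in variant B.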

2 More Replies

by AChang • New Contributor III

08-22-2023 1:38:44 PM

1813 Views
2 replies
1 kudos

How to fix this runtime error in this Databricks distributed training tutorial workbook

I am following along with this notebook, found in this article. I am attempting to fine-tune the model with a single node and multiple GPUs, so I run everything up to the "Run Local Training" section, but from there I skip to "Run distributed traini...

Machine Learning

Latest Reply

KYX
New Contributor

Monday

1 kudos

Hi AChang, did you eventually resolve the error? I'm also getting the same error.

1 More Replies

by amal15 • New Contributor II

Saturday

360 Views
2 replies
1 kudos

Resolved! import ml.dmlc.xgboost4j.scala.spark.{XGBoostEstimator, XGBoostClassificationModel}

How can I import the following in a Spark and Scala project in Databricks: import com.microsoft.ml.spark.{LightGBMClassifier, LightGBMClassificationModel} import ml.dmlc.xgboost4j.scala.spark.{XGBoostEstimator, XGBoostClassificationModel}

Machine Learning

Latest Reply

amal15
New Contributor II

Monday

1 kudos

XGBoostEstimator is not a member of package ml.dmlc.xgboost4j.scala.spark? How can I resolve this error? With Maven: ml.dmlc:xgboost4j-spark_2.12:2.0.3

1 More Replies

by chrisf_sts • New Contributor II

Sunday

231 Views
0 replies
0 kudos

Extract calculations naive bayes model

I have a naive Bayes ML model that takes call attributes and predicts if the caller is going to abandon the call while they are on hold waiting to speak to an agent. The model lives in Databricks MLflow, and I have it registered. What I need to do is ex...

Machine Learning

by Lcsp • New Contributor

a week ago

277 Views
0 replies
0 kudos

AssertionError Failed to create the catalog

Getting this error when trying to set up the get-started-with-databricks-for-machine-learning lab. Unity Catalog is enabled. Validating the locally installed datasets: | listing local files...(0 seconds) | validation completed...(0 seconds total) C...

Machine Learning

by amal15 • New Contributor II

a week ago

72 Views
0 replies
0 kudos

error: not found: type XGBoostEstimator

error: not found: type XGBoostEstimator Spark & Scala

Machine Learning

by tanjil • New Contributor III

2 weeks ago

278 Views
2 replies
0 kudos

Import mlflow Error

Hello, I am trying to replicate this notebook in my environment: mlflow-end-to-end-example - Databricks. However, I am getting the following error when I run "import mlflow": "TypeError: bases must be types". How can I solve this issue? Thank you, Tanji...

Machine Learning

Latest Reply

Walter_C
Valued Contributor II

2 weeks ago

0 kudos

Can you share the specific cell of the notebook where you are receiving this error? Have you modified the code, or is it the same? Do you have any particular libraries installed on the cluster you are using for the testing?

1 More Replies

by Kaizen • Contributor III

a week ago

477 Views
2 replies
0 kudos

Unity Catalog table management with multiple teams members

Hi! How are you guys managing large teams working on the same project? Each member has their own data to save in Unity Catalog. Based on my understanding, there are only two ways to manage this: 1) Create an individual member schema so they can store thei...

Machine Learning

Latest Reply

Kaizen
Contributor III

a week ago

0 kudos

Any suggestions regarding this? @s_park, @Sujitha, @Debayan

1 More Replies
