Machine Learning
Dive into the world of machine learning on the Databricks platform. Explore discussions on algorithms, model training, deployment, and more. Connect with ML enthusiasts and experts.
Data + AI Summit 2024 - Data Science & Machine Learning

Forum Posts

Data_Cowboy
by New Contributor III
  • 1752 Views
  • 3 replies
  • 0 kudos

Resolved! Problems with xgboost.spark model loading from MLflow.

When I load an xgboost model from MLflow, following the instructions provided for Databricks-hosted MLflow, the input sizes shown on the job are over 1 TB. Is anyone else using an xgboost.spark model and noticing the same behavior? Below are som...

Latest Reply
dbx-user7354
New Contributor III
  • 0 kudos

Thank you very much, @Data_Cowboy! I had the same issue; I even saw 14 TiB. Databricks should really fix this.

  • 0 kudos
2 More Replies
Colombia
by New Contributor II
  • 644 Views
  • 2 replies
  • 1 kudos

Using the API from the enerbitdso 0.1.8 package (PyPI)

Hello! I have code that uses an API supplied in the enerbitdso package (the repository is https://pypi.org/project/enerbitdso/). I adapted the code to Azure Databricks in Python, but although there is a connection with the API, it does ...

Latest Reply
Colombia
New Contributor II
  • 1 kudos

The owner of the package updated it to accept the timeout as a parameter of up to 20 seconds and also updated a dependent package in Databricks. With those changes, the problem was solved.

  • 1 kudos
1 More Replies
re
by New Contributor II
  • 536 Views
  • 2 replies
  • 0 kudos

RBAC and VectorSearch

When implementing the managed VectorSearch, what is the preferred way to implement row based access control? I see that you can use the filter API during a query, so simple filters using a certain column may work, but what if all the security informa...

Latest Reply
re
New Contributor II
  • 0 kudos

Thanks AI for summarizing my question. However, you did not actually answer it.

  • 0 kudos
1 More Replies
moh3th1
by New Contributor
  • 601 Views
  • 1 reply
  • 0 kudos

Optimal Cluster Configuration for Training on Billion-Row Datasets

Hello Databricks Community, I am currently facing a challenge in configuring a cluster for training machine learning models on a dataset consisting of approximately a billion rows and 40 features. Given the volume of data, I want to ensure that the cl...

Latest Reply
Kaniz_Fatma
Community Manager
  • 0 kudos

Hi @moh3th1, Machine Selection: Memory (RAM): Having sufficient memory is essential for large datasets. Ensure that your machine type has enough RAM to accommodate your data. CPU: CPU power impacts data processing speed. Consider CPUs with multiple...

  • 0 kudos
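
As a rough back-of-the-envelope sketch of the memory point in this thread, the raw size of a dense numeric table with about a billion rows and 40 features can be estimated as below. The function name and the assumption that every feature is stored as a dense 8-byte or 4-byte number are illustrative and not details from the original post; actual footprints depend on data types, compression, and framework overhead.

```python
def estimate_dense_size_gb(rows: int, features: int, bytes_per_value: int) -> float:
    """Raw size, in decimal gigabytes, of a dense rows x features numeric table."""
    return rows * features * bytes_per_value / 10**9


# Figures taken from the question: roughly a billion rows and 40 features.
rows = 1_000_000_000
features = 40

# Assumption: all features are dense floats (8 bytes for float64, 4 for float32).
size_float64_gb = estimate_dense_size_gb(rows, features, 8)
size_float32_gb = estimate_dense_size_gb(rows, features, 4)

print(f"float64: {size_float64_gb:.0f} GB, float32: {size_float32_gb:.0f} GB")
# prints "float64: 320 GB, float32: 160 GB"
```

An estimate like this only gives a lower bound on the total cluster memory to consider; it says nothing about which instance type or worker count is appropriate.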
Anonymous
by Not applicable
  • 144518 Views
  • 60 replies
  • 5 kudos

Community Edition Login Issues

Below is a list of troubleshooting steps for failing to log in with email/password at community.cloud.databricks.com. Troubleshooting Tips: If this is your first time logging in, ensure that you did indeed sign u...

Latest Reply
akuma67
New Contributor II
  • 5 kudos

Hey, I have been logged out, and even the password reset email is not arriving. How long does it take to resolve this? My account is ak.email86@gmail.com

  • 5 kudos
59 More Replies
amal15
by New Contributor II
  • 397 Views
  • 1 reply
  • 0 kudos

XGBoostEstimator is not a member of package ml.dmlc.xgboost4j.scala.spark ?

I get the error "XGBoostEstimator is not a member of package ml.dmlc.xgboost4j.scala.spark". How can I resolve this error?

Latest Reply
Kaniz_Fatma
Community Manager
  • 0 kudos

Hi @amal15, The error message you’re encountering, “XGBoostEstimator is not a member of package ml.dmlc.xgboost4j.scala.spark,” indicates that the XGBoostEstimator class is not being recognized within the specified package.  Check Dependencie...

  • 0 kudos
e6exghu8
by New Contributor
  • 1095 Views
  • 1 reply
  • 0 kudos

Help - org.apache.spark.SparkException: Job aborted due to stage failure: Task 47 in stage 2842.0

Hello, I am training a SparkXGBRegressor model. It runs without errors if the complexity is low, however when I increase the max_depth and/or num_parallel_tree parameters, I get an error. I checked the cluster metrics during training and it doesn't l...

Latest Reply
Kaniz_Fatma
Community Manager
  • 0 kudos

Hi @e6exghu8, Ensure that your cluster has sufficient memory to handle the increased complexity (higher max_depth and num_parallel_tree). Check the memory configuration for your Spark executors. You might need to allocate more memory to each executor...

  • 0 kudos
cmilligan
by Contributor II
  • 4073 Views
  • 3 replies
  • 2 kudos

Issue with 'Multi-column In predicates are not supported in the DELETE condition'

I'm trying to delete rows from a table with the same date or id as records in another table. I'm using the below query and get the error 'Multi-column In predicates are not supported in the DELETE condition'. delete from cost_model.cm_dispatch_consol...

Latest Reply
shubhaskar
New Contributor II
  • 2 kudos

Had the same issue. Please check the value returned by the subquery; there must be something wrong with it.

  • 2 kudos
2 More Replies
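
The original query is truncated above, so the following is only a sketch of one way to express "delete rows with the same date or id as records in another table" without a multi-column IN predicate: two single-column IN predicates combined with OR. The table names (`target`, `incoming`) and columns (`id`, `dt`, `cost`) are hypothetical, and SQLite is used here purely as a local stand-in so the example runs anywhere; whether the same form avoids the error on a given Databricks runtime is an assumption to verify.

```python
import sqlite3

# In-memory SQLite database used as a stand-in; tables and columns are hypothetical.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE target (id INTEGER, dt TEXT, cost REAL)")
cur.execute("CREATE TABLE incoming (id INTEGER, dt TEXT)")

cur.executemany(
    "INSERT INTO target VALUES (?, ?, ?)",
    [
        (1, "2024-01-01", 10.0),
        (2, "2024-01-02", 20.0),
        (3, "2024-01-03", 30.0),
        (4, "2024-01-04", 40.0),
    ],
)
cur.executemany(
    "INSERT INTO incoming VALUES (?, ?)",
    [
        (1, "2024-02-01"),  # shares an id with target row 1
        (9, "2024-01-03"),  # shares a date with target row 3
    ],
)

# Each IN predicate compares a single column against a single-column subquery,
# instead of a tuple such as (id, dt) IN (SELECT id, dt FROM incoming).
cur.execute(
    """
    DELETE FROM target
    WHERE id IN (SELECT id FROM incoming)
       OR dt IN (SELECT dt FROM incoming)
    """
)
conn.commit()

remaining = [row[0] for row in cur.execute("SELECT id FROM target ORDER BY id")]
print(remaining)  # prints [2, 4]
```

Note that this matches on id or date independently, which is what the question describes; a delete that must match both columns together has different semantics and would need a different rewrite.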
AChang
by New Contributor III
  • 2701 Views
  • 2 replies
  • 1 kudos

How to fix this runtime error in this Databricks distributed training tutorial workbook

I am following along with this notebook found from this article. I am attempting to fine tune the model with a single node and multiple GPUs, so I run everything up to the "Run Local Training" section, but from there I skip to "Run distributed traini...

Latest Reply
KYX
New Contributor II
  • 1 kudos

Hi AChang, did you eventually resolve the error? I'm also getting the same error.

  • 1 kudos
1 More Replies
amal15
by New Contributor II
  • 1287 Views
  • 2 replies
  • 1 kudos

Resolved! import ml.dmlc.xgboost4j.scala.spark.{XGBoostEstimator, XGBoostClassificationModel}

How can I import the following in a Spark and Scala project in Databricks? import com.microsoft.ml.spark.{LightGBMClassifier, LightGBMClassificationModel} and import ml.dmlc.xgboost4j.scala.spark.{XGBoostEstimator, XGBoostClassificationModel}

Latest Reply
amal15
New Contributor II
  • 1 kudos

I get the error "XGBoostEstimator is not a member of package ml.dmlc.xgboost4j.scala.spark". How can I resolve this error? I am using the Maven coordinate ml.dmlc:xgboost4j-spark_2.12:2.0.3

  • 1 kudos
1 More Replies
Kaizen
by Valued Contributor
  • 891 Views
  • 2 replies
  • 0 kudos

Unity Catalog table management with multiple team members

Hi! How are you guys managing large teams working on the same project? Each member has their own data to save in Unity Catalog. Based on my understanding, there are only two ways to manage this: 1) Create an individual member schema so they can store thei...

Latest Reply
Kaizen
Valued Contributor
  • 0 kudos

Any suggestions regarding this? @s_park, @Sujitha, @Debayan

  • 0 kudos
1 More Replies
Kaizen
by Valued Contributor
  • 2147 Views
  • 6 replies
  • 2 kudos

Resolved! Endpoint performance questions

Hi! I got some really interesting results from endpoint performance tests I ran. I set up the non-optimized endpoint with zero-cluster scaling, and the optimized endpoint had this feature disabled. 1) Why does the non-optimized endpoint have variable response time fo...

Latest Reply
Kaniz_Fatma
Community Manager
  • 2 kudos

Hi @Kaizen, Let’s delve into your intriguing endpoint performance observations: Variable Response Time: The non-optimized endpoint exhibiting variable response times during different test durations (3600, 1800, and 600 seconds) can be attributed ...

  • 2 kudos
5 More Replies
Nishat
by New Contributor
  • 634 Views
  • 1 reply
  • 0 kudos

Serving a custom transformer class via a pyfunc wrapper for a pyspark recommendation model

I am trying to serve an ALS pyspark model with a custom transformer (for generating user-specific recommendations) via a pyfunc wrapper. Although I can successfully score the logged model, the serving endpoint is throwing the following error. URI '/mod...

Latest Reply
Kaniz_Fatma
Community Manager
  • 0 kudos

Hi @Nishat, Ensure that the path you’re using for the model artefacts is correctly configured and accessible within your environment. Verify that the model artefacts are stored in a location accessible by the serving endpoint. Double-check the path an...

  • 0 kudos
tiho
by New Contributor
  • 1111 Views
  • 2 replies
  • 0 kudos

Vector Search Index Sync fails in Initializing

Vector Search Index Sync fails in Initializing. This index table was already up and running, and when I tried to sync it, it failed in Initializing. See the attached.  

Latest Reply
amitpphatak
New Contributor II
  • 0 kudos

Were you able to resolve this issue? I am facing the same issue right now - I am on AWS.

  • 0 kudos
1 More Replies
marcelo2108
by Contributor
  • 12791 Views
  • 26 replies
  • 0 kudos

Problem when serving a langchain model on Databricks

I'm trying to serve an LLM LangChain model, and every time it fails with this message: [6b6448zjll] [2024-02-06 14:09:55 +0000] [1146] [INFO] Booting worker with pid: 1146 [6b6448zjll] An error occurred while loading the model. You haven't confi...

Latest Reply
marcelo2108
Contributor
  • 0 kudos

Hi @DataWrangler and team. I managed to solve the initial problem using some of the tips you gave. I used your code as a base and adapted it to my setup, that is, no UC enabled and no ability to use DatabricksEmbeddings, DatabricksVectorSearch ...

  • 0 kudos
25 More Replies