Problems with xgboost.spark model loading from MLflow.

Data_Cowboy
New Contributor III

When I load an xgboost.spark model from MLflow, following the instructions provided in Databricks-hosted MLflow, the input size shown for the job is over 1 TB. Is anyone else using an xgboost.spark model and seeing the same behavior?

Below are some screenshots showing the input size. The job has been running for over 15 minutes just to load the model from MLflow.

[Screenshots: job input size]

1 ACCEPTED SOLUTION

Data_Cowboy
New Contributor III

Getting rid of the call to the full dbfs artifact path seemed to fix the issue for me.
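The screenshot with the exact change is not available here, so the following is only a sketch of one way to read this fix: pass MLflow a `runs:/` URI and let it resolve the artifact location, rather than spelling out the full `dbfs:/` artifact path. It assumes the model was logged with `mlflow.spark.log_model` under the artifact path `model`; the helper names and the run ID are hypothetical.

```python
def runs_uri(run_id: str, artifact_path: str = "model") -> str:
    """Build an MLflow `runs:/` URI instead of a full dbfs:/ artifact path."""
    if not run_id:
        raise ValueError("run_id must be a non-empty string")
    # Strip stray slashes so "/model/" and "model" give the same URI.
    return f"runs:/{run_id}/{artifact_path.strip('/')}"


def load_spark_model(run_id: str, artifact_path: str = "model"):
    """Load a Spark ML model (e.g. one trained with xgboost.spark) via MLflow.

    Assumption: the model was logged with mlflow.spark.log_model.
    """
    # Imported lazily: needs mlflow installed and an active Spark session.
    import mlflow.spark

    return mlflow.spark.load_model(runs_uri(run_id, artifact_path))
```

On a cluster this would be called as `model = load_spark_model("<run_id>")`, with the run ID taken from the MLflow run page. Whether this matches the change in the screenshot is a guess.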

[Screenshot]

3 REPLIES

dbx-user7354
New Contributor III

Thank you very much @Data_Cowboy !!! I had the same issue. I even had 14 TiB 😄 

Databricks should really fix this

Data_Cowboy

@dbx-user7354 Glad to hear this solution worked out for you. Makes me feel good that I came back and answered my own post 😀
