Machine Learning

Problems with xgboost.spark model loading from MLflow.

Data_Cowboy
New Contributor III

When I load an xgboost model from MLflow, following the instructions provided in Databricks-hosted MLflow, the job shows input sizes of over 1 TB. Is anyone else using an xgboost.spark model and noticing the same behavior?

Below are some screenshots showing the input size. The job has been running for over 15 minutes just to load the model from MLflow.

[Screenshots: input size shown on the job]

Accepted Solutions

Data_Cowboy
New Contributor III

Removing the call that used the full DBFS artifact path seemed to fix the issue for me.
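
For anyone landing here later, below is a minimal sketch of one way to load the model without spelling out a full `dbfs:/` artifact path: pass MLflow a `runs:/` URI instead. This is an assumption about what the change looks like, not necessarily the exact code used above. The helper names `runs_uri` and `load_spark_model`, the run ID, and the `model` artifact path are all hypothetical placeholders, and the sketch assumes the model was logged with the `mlflow.spark` flavor.

```python
def runs_uri(run_id: str, artifact_path: str = "model") -> str:
    """Build an MLflow 'runs:/' model URI (hypothetical helper).

    This avoids hard-coding a full 'dbfs:/...' artifact location and lets
    MLflow resolve where the artifacts live.
    """
    return f"runs:/{run_id}/{artifact_path.strip('/')}"


def load_spark_model(run_id: str, artifact_path: str = "model"):
    """Load a Spark-flavor model by run ID (hypothetical helper).

    Assumes mlflow is installed and an active Spark session exists,
    as on a Databricks cluster.
    """
    # Imported lazily so runs_uri() can be used without mlflow installed.
    import mlflow.spark

    return mlflow.spark.load_model(runs_uri(run_id, artifact_path))
```

On a cluster this would be called as `load_spark_model("<run_id>")`, with the run ID taken from the MLflow run page. Whether this brings the input size down in your workspace is something to check in the Spark UI rather than assume.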



3 REPLIES

dbx-user7354
New Contributor III

Thank you very much, @Data_Cowboy! I had the same issue. In my case the input size was even 14 TiB 😄

Databricks should really fix this.

Data_Cowboy
New Contributor III

@dbx-user7354 Glad to hear this solution worked for you. It makes me feel good that I came back and answered my own post 😀
