Problems with xgboost.spark model loading from MLflow.

Data_Cowboy
New Contributor III

When I load an xgboost.spark model from MLflow, following the provided instructions for Databricks-hosted MLflow, the job reports input sizes of over 1 TB. Is anyone else using an xgboost.spark model and noticing the same behavior?

Below are some screenshots showing the input size. The job has been running for over 15 minutes just to load the model from MLflow.

[Screenshots: image.png, showing the job's input size]

1 ACCEPTED SOLUTION

Data_Cowboy
New Contributor III

Getting rid of the call to the full dbfs artifact path seemed to fix the issue for me.

[image]
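
The screenshot with the fix is not reproduced here, so the following is only a sketch of one way to avoid the full DBFS artifact path: hand MLflow a `runs:/` URI and let it resolve the artifact location itself. It assumes the model was logged with the `mlflow.spark` flavor under an artifact path named `model`. The helper names `runs_uri` and `load_spark_model` and the run ID are hypothetical, and this may not be the exact change made in the original post.

```python
def runs_uri(run_id: str, artifact_path: str = "model") -> str:
    """Build a runs:/ model URI instead of a hard-coded dbfs:/ artifact path."""
    return f"runs:/{run_id}/{artifact_path.strip('/')}"


def load_spark_model(run_id: str, artifact_path: str = "model"):
    """Load a Spark-flavor model (e.g. one trained with xgboost.spark) by run ID."""
    # Imported lazily so the URI helper works without MLflow installed.
    import mlflow.spark

    # Assumption: the model was logged with mlflow.spark.log_model.
    return mlflow.spark.load_model(runs_uri(run_id, artifact_path))


# Hypothetical run ID, for illustration only.
print(runs_uri("0123456789abcdef"))
```

On a cluster, `model = load_spark_model("<run_id>")` followed by `model.transform(df)` would then score a Spark DataFrame, with the actual run ID substituted for the placeholder.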


3 REPLIES

dbx-user7354
New Contributor III

Thank you very much, @Data_Cowboy! I had the same issue. In my case the input size was even 14 TiB 😄

Databricks should really fix this.

@dbx-user7354 Glad to hear this solution worked out for you. Makes me feel good that I came back and answered my own post 😀
