Machine Learning
Dive into the world of machine learning on the Databricks platform. Explore discussions on algorithms, model training, deployment, and more. Connect with ML enthusiasts and experts.
Data + AI Summit 2024 - Data Science & Machine Learning

Forum Posts

johnp
by New Contributor III
  • 2685 Views
  • 2 replies
  • 0 kudos

Resolved! pdb debugger on databricks

I am new to Databricks and am trying to debug my Python application with the variable explorer by following the instructions at https://www.databricks.com/blog/new-debugging-features-databricks-notebooks-variable-explorer. I added the "import pdb" in the fi...

Latest Reply
johnp
New Contributor III
  • 0 kudos

I tested with some simple applications, and it works as you described. However, the application I am debugging uses PySpark Structured Streaming, which runs continuously. After inserting pdb.set_trace(), the application paused at the breakpoint, but t...

1 More Replies
kng88
by New Contributor II
  • 3563 Views
  • 6 replies
  • 7 kudos

How to save a model produced by distributed training?

I am trying to save a model after distributed training via the following code:
    import sys
    from spark_tensorflow_distributor import MirroredStrategyRunner
    import mlflow.keras
    mlflow.keras.autolog()
    mlflow.log_param("learning_rate", 0.001)
    import...

Latest Reply
Xiaowei
New Contributor III
  • 7 kudos

I think I finally worked this out. Here is the extra code to save out the model only once and from the 1st node:
    context = pyspark.BarrierTaskContext.get()
    if context.partitionId() == 0:
        mlflow.keras.log_model(model, "mymodel")
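The guard in this reply can be tried out without a cluster. The sketch below is only an illustration under stated assumptions, not the poster's training function: `FakeBarrierContext` and `log_once` are hypothetical names, with the fake context standing in for `pyspark.BarrierTaskContext.get()` and the callback standing in for the `mlflow.keras.log_model(model, "mymodel")` call.

```python
class FakeBarrierContext:
    """Hypothetical stand-in for pyspark.BarrierTaskContext (illustration only)."""

    def __init__(self, partition_id):
        self._partition_id = partition_id

    def partitionId(self):
        return self._partition_id


def log_once(context, log_fn):
    """Run log_fn only on the task whose partition ID is 0."""
    if context.partitionId() == 0:
        log_fn()
        return True
    return False


# Simulate four workers; only partition 0 should record the model.
logged_by = []
for pid in range(4):
    log_once(FakeBarrierContext(pid), lambda pid=pid: logged_by.append(pid))

print(logged_by)
```

Every worker runs the same training function, so without the partition check each one would attempt to log its own copy of the model.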

5 More Replies
yorabhir
by New Contributor II
  • 1401 Views
  • 1 replies
  • 1 kudos

Resolved! 'error_code': 'INVALID_PARAMETER_VALUE', 'message': 'Too many sources. It cannot be more than 100'

I am getting the following error while saving a delta table in the feature store: WARNING databricks.feature_store._catalog_client_helper: Failed to record data sources in the catalog. Exception: {'error_code': 'INVALID_PARAMETER_VALUE', 'message': 'To...

Latest Reply
Kaniz_Fatma
Community Manager
  • 1 kudos

Hi @yorabhir, Verify how many sources you’re trying to record in the catalog. If it exceeds 100, you’ll need to reduce the number of sources. Ensure that the feature table creation process is correctly configured. In your code snippet, you’re creatin...

MaKarenina
by New Contributor
  • 983 Views
  • 1 replies
  • 0 kudos

ML Flow until January 24

Hi! When I was creating a new endpoint, I got this alert: CREATE A MODEL SERVING ENDPOINT TO SERVE YOUR MODEL BEHIND A REST API INTERFACE. YOU CAN STILL USE LEGACY ML FLOW MODEL SERVING UNTIL JANUARY 2024. I don't understand if my Legacy MLFlow Model ...

Latest Reply
Kaniz_Fatma
Community Manager
  • 0 kudos

Hi @MaKarenina, The alert you received states that you can continue using Legacy MLflow Model Serving until January 2024. However, there are a few important points to consider: Support: After January 2024, Legacy MLflow Model Serving will no lon...

Alessandro
by New Contributor
  • 1343 Views
  • 1 replies
  • 0 kudos

using openai Api in Databricks without iterating rows

Hi everyone, I have a Delta table with a column 'comment'. I would like to add a new column 'sentiment', and I would like to calculate it using the OpenAI API. I already know how to create a Databricks endpoint to an external model and how to use it (us...

Latest Reply
Kaniz_Fatma
Community Manager
  • 0 kudos

Hi @Alessandro, Your question is clear, and I appreciate your curiosity about optimizing the process. Let’s explore a couple of approaches: UDF (User-Defined Function): You can create a UDF in Databricks that invokes the OpenAI API for sentiment...
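One way to picture scoring comments in groups rather than one row at a time is the plain-Python sketch below. Everything in it is an assumption made for illustration: `batched` is a small helper, and `fake_sentiment_batch` is a hypothetical stand-in for a single request that scores several comments at once. It does not call OpenAI or any Databricks endpoint.

```python
def batched(items, size):
    """Yield consecutive slices of items, each with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def fake_sentiment_batch(comments):
    """Hypothetical stand-in for one API call that scores many comments."""
    return ["positive" if "good" in c.lower() else "negative" for c in comments]


comments = [
    "Good product",
    "Arrived broken",
    "Really good support",
    "Too slow",
    "good value",
]

sentiments = []
calls = 0
for chunk in batched(comments, 2):
    sentiments.extend(fake_sentiment_batch(chunk))
    calls += 1

print(calls, sentiments)
```

In a Spark job the same shape would sit inside a function that receives a whole batch of column values, so the number of remote calls scales with the number of batches rather than the number of rows.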

Mirko
by Contributor
  • 1951 Views
  • 3 replies
  • 1 kudos

Resolved! AutoMl Dataset too large

Hello community, I have the following problem: I am using AutoML to train a regression model, but in the preprocessing my dataset is sampled to ~30% of the original amount. I am using runtime 14.2 ML. Driver: Standard_DS4_v2, 28GB Memory, 8 cores. Worker: S...

Latest Reply
Mirko
Contributor
  • 1 kudos

I am pretty sure that I know what the problem was. I had a timestamp column (with second precision) as a feature. If its values get one-hot encoded, the dataset can get pretty large.
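The effect described here can be illustrated by counting categories. The sketch below uses synthetic timestamps (an assumption, not the poster's data) and compares how many one-hot columns the raw second-precision values would need against two coarser features derived from them.

```python
from datetime import datetime, timedelta

# Synthetic data: 10,000 timestamps spaced 37 seconds apart.
start = datetime(2024, 1, 1, 0, 0, 0)
timestamps = [start + timedelta(seconds=37 * i) for i in range(10_000)]

# One-hot encoding needs one column per distinct value.
raw_categories = len(set(timestamps))
hour_categories = len({ts.hour for ts in timestamps})
weekday_categories = len({ts.weekday() for ts in timestamps})

print("raw timestamp columns:", raw_categories)
print("hour-of-day columns:", hour_categories)
print("day-of-week columns:", weekday_categories)
```

Because nearly every second-precision timestamp is unique, the raw column contributes roughly one feature per row, while hour of day and day of week are capped at 24 and 7 columns.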

2 More Replies
Miki
by New Contributor II
  • 1069 Views
  • 3 replies
  • 0 kudos

Error: batch scoring with mlflow.keras flavor model

I am logging a trained Keras model using the following:
    fe.log_model(
        model=model,
        artifact_path="wine_quality_prediction",
        flavor=mlflow.keras,
        training_set=training_set,
        registered_model_name=model_name
    )
And when I call the following:
    predictions_...

Machine Learning
FeatureEngineeringClient
keras
mlflow
Latest Reply
Kaniz_Fatma
Community Manager
  • 0 kudos

Hi @Miki, The OSError: [Errno 30] Read-only file system typically occurs when you attempt to write to a directory that is read-only or does not exist. Let’s explore some possible solutions: Check the Path: Ensure that the path you’ve provided fo...
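A path check of the kind suggested in this reply might look like the sketch below. It is a generic illustration using only the standard library: `writable_dir` is a hypothetical helper, the preferred path is a placeholder, and nothing here is confirmed to resolve the poster's specific error.

```python
import os
import tempfile


def writable_dir(preferred):
    """Return `preferred` if it exists and is writable, else a fresh temp directory."""
    if os.path.isdir(preferred) and os.access(preferred, os.W_OK):
        return preferred
    return tempfile.mkdtemp(prefix="model_artifacts_")


# Placeholder path that is assumed not to exist, so the fallback is used.
target = writable_dir("/path/that/does/not/exist")

# Confirm the chosen directory really accepts writes.
probe = os.path.join(target, "probe.txt")
with open(probe, "w") as fh:
    fh.write("ok")

print(target)
```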

2 More Replies
stanjs
by New Contributor III
  • 825 Views
  • 2 replies
  • 0 kudos

BAD_REQUEST: ExperimentIds cannot be empty when checking ACLs in bulk

I encountered the error when using Databricks CE to log experiments from MLflow. It worked perfectly fine before, but now I cannot open any of my experiments. I tried clearing the cookies, changing the browser, and creating a new account to manually create ...

Latest Reply
Kaniz_Fatma
Community Manager
  • 0 kudos

Hi @stanjs, I understand that you’re facing issues with accessing your MLflow experiments in Databricks CE. Let’s troubleshoot this together. Here are some steps you can take to resolve the issue: Check Experiment Permissions: With the extension ...

1 More Replies
prafull
by New Contributor
  • 826 Views
  • 1 replies
  • 0 kudos

How to use mlflow to log a composite estimator (multiple pipes) and then deploy it as rest endpoint

Hello, I am trying to deploy a composite estimator as a single model by logging the run with MLflow and registering the model. Can anyone help with how this can be done? This estimator contains different chains: text: data - tfidf - svm - svm.decision_funct...

Machine Learning
ML
mlflow
model
python
Latest Reply
Kaniz_Fatma
Community Manager
  • 0 kudos

Hi @prafull , Deploying a composite estimator with MLflow involves several steps. Let’s break it down: Logging the Run with MLflow: First, you’ll need to train your composite estimator using the different pipelines you’ve mentioned (text and cat...

mbejarano89
by New Contributor III
  • 613 Views
  • 1 replies
  • 0 kudos

ApplyInPandas failing at a particular grouped item

Hello, I have code that performs a forecast for 21k items in parallel. It looks like this:
    def forward_forecast(data):
        model = ETSModel(window_data, error='add', trend='add', seasonal=None)
        fitted_model = model.fit(disp=0)
        ...

Latest Reply
Kaniz_Fatma
Community Manager
  • 0 kudos

Hi @mbejarano89, The error message you’re encountering, “unsupported operand type(s) for -: ‘NoneType’ and ‘int’”, indicates that you’re trying to perform a subtraction operation between a NoneType and an integer. Let’s break down the issue and expl...
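To find which group triggers this kind of TypeError, one option is to catch the exception per group and record the key instead of letting the whole job fail. The sketch below is a plain-Python analogue, not the poster's forecasting code: `safe_apply`, `lag_from_first`, and the sample groups are all hypothetical.

```python
def safe_apply(groups, func):
    """Apply func to each group, collecting TypeErrors instead of aborting."""
    results, failures = {}, {}
    for key, values in groups.items():
        try:
            results[key] = func(values)
        except TypeError as exc:
            failures[key] = str(exc)
    return results, failures


def lag_from_first(values):
    """Hypothetical per-group computation that subtracts an int from the first value."""
    return values[0] - 1


# item_b contains a missing value in the position the computation relies on.
groups = {"item_a": [1, 4, 9], "item_b": [None, 3, 7]}
results, failures = safe_apply(groups, lag_from_first)

print(results)
print(failures)
```

The same try/except pattern could wrap the body of a grouped function so that the failing item's key is reported alongside the error message.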

Sam
by New Contributor III
  • 1109 Views
  • 1 replies
  • 0 kudos

MLFlow connection pool warning

Hi, I have a transformer model from Hugging Face that I have logged to MLflow. When I load it using mlflow.transformers.load_model, I receive a bunch of warnings: WARNING:urllib3.connectionpool:Connection pool is full, discarding connection: xxxx. Connection...

Latest Reply
Kaniz_Fatma
Community Manager
  • 0 kudos

Hi @Sam, The warnings you’re encountering are related to urllib3, which is a Python library for handling HTTP connections. Let’s break down the issue and explore potential solutions: Connection Pool Warnings: The warning message indicates that th...
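If the warnings are only noise, one possible approach is to raise the level of the logger that emits them. The sketch below uses the standard `logging` module and the logger name visible in the warning text (`urllib3.connectionpool`). It only hides the messages and does not change how many connections the pool keeps.

```python
import logging

# Raise the threshold for urllib3's connection-pool logger so that
# WARNING records such as "Connection pool is full" are dropped.
pool_logger = logging.getLogger("urllib3.connectionpool")
pool_logger.setLevel(logging.ERROR)

# WARNING is now below the logger's threshold; ERROR still gets through.
print(pool_logger.isEnabledFor(logging.WARNING))
print(pool_logger.isEnabledFor(logging.ERROR))
```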

fawzi
by New Contributor
  • 4168 Views
  • 1 replies
  • 0 kudos

MLOPS retraining

https://docs.databricks.com/en/machine-learning/mlops/mlops-workflow.html#7-retraining In this article, it is mentioned that we can trigger retraining from the alerts. Triggered. If the monitoring pipeline can identify model performance issues and send...

Machine Learning
Alerts
job
MLOPS
Retraining
Webhook
Latest Reply
Kaniz_Fatma
Community Manager
  • 0 kudos

Hi @fawzi,  Let’s delve into the world of webhooks and discover how to obtain the URI for a specific job workflow. Webhooks are a powerful mechanism for enabling real-time communication between applications. They allow apps to notify each other ab...

Shumi8
by New Contributor
  • 1131 Views
  • 1 replies
  • 0 kudos

Databricks MlFlow Error: Timed out while evaluating the model.

Hi everyone, I am using Databricks and MLflow to create a model and then register it as a serving endpoint. Sometimes the model takes more than 2 minutes to run, and after 2 minutes it gives a timeout error: Timed out while evaluating the model. Verify...

Latest Reply
Kaniz_Fatma
Community Manager
  • 0 kudos

Hi @Shumi8, When dealing with timeout issues in MLflow, it’s essential to configure the relevant parameters to ensure your server remains responsive. Let’s address this step by step: MLFLOW_SCORING_SERVER_REQUEST_TIMEOUT: This parameter controls ...

NaeemS
by New Contributor III
  • 1000 Views
  • 1 replies
  • 0 kudos

Handling Null Values in Feature Stores

Hi, I am using multiple feature stores in my workflow using feature lookups. In my logged pipeline, I have several stages, including Assembler, Standard Scaler, Indexer and then Model. However, I am facing an issue during inference using the `score b...

Machine Learning
Feature Store
Latest Reply
Kaniz_Fatma
Community Manager
  • 0 kudos

Hi @NaeemS , Handling null values in feature stores is crucial to ensure robustness and reliability in your machine learning pipelines. Let’s explore some strategies to address this issue: Custom Transformer Stage: You’ve already considered addin...

AdamIH123
by New Contributor
  • 1257 Views
  • 1 replies
  • 0 kudos

Feature Store Log Model and Score Batch - env_manager

Hi everyone. I have a couple of questions about the feature store log model and score batch. After you log a model with the feature store and then use fs.score_batch, is it possible to pass the env_manager to predict with the same env as training as descr...

Machine Learning
feature_store
log_model
score_batch
Latest Reply
MohsenJ
Contributor
  • 0 kudos

I would also like to know if that works.
