Databricks

gibbona1 · ‎02-07-2022

I trained a basic image classification model on MNIST using TensorFlow, logging the experiment run with MLflow.

Model: "my_sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 reshape (Reshape)           (None, 28, 28, 1)         0

 conv2d (Conv2D)             (None, 26, 26, 32)        320

 max_pooling2d (MaxPooling   (None, 13, 13, 32)        0
 2D)

 flatten (Flatten)           (None, 5408)              0

 dense (Dense)               (None, 100)               540900

 dense_1 (Dense)             (None, 10)                1010
                                                                 
=================================================================
Total params: 542,230
Trainable params: 542,230
Non-trainable params: 0
_________________________________________________________________

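import mlflow

# Turn on autologging first so the run captures parameters, metrics, and the model.
mlflow.tensorflow.autolog()

with mlflow.start_run() as run:
    run_id = run.info.run_id  # kept so the run can be found when registering the model

    model.fit(trainX, trainY,
              validation_data=(testX, testY),
              epochs=2,
              batch_size=64)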

I then registered the model and enabled model serving.

When trying to send the JSON text through the browser in the form

[{"b64": "AA...AA=="}]

I'm getting errors like the following:

BAD_REQUEST: Encountered an unexpected error while evaluating the model. Verify that the serialized input Dataframe is compatible with the model for inference.
 
Traceback (most recent call last):
  File "/databricks/conda/envs/model-10/lib/python3.8/site-packages/mlflow/pyfunc/scoring_server/__init__.py", line 306, in transformation
    raw_predictions = model.predict(data)
  File "/databricks/conda/envs/model-10/lib/python3.8/site-packages/mlflow/pyfunc/__init__.py", line 605, in predict
    return self._model_impl.predict(data)
  File "/databricks/conda/envs/model-10/lib/python3.8/site-packages/mlflow/keras.py", line 475, in predict
    predicted = _predict(data)
  File "/databricks/conda/envs/model-10/lib/python3.8/site-packages/mlflow/keras.py", line 462, in _predict
    predicted = pd.DataFrame(self.keras_model.predict(data.values))
  File "/databricks/conda/envs/model-10/lib/python3.8/site-packages/keras/utils/traceback_utils.py", line 67, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "/databricks/conda/envs/model-10/lib/python3.8/site-packages/tensorflow/python/framework/func_graph.py", line 1147, in autograph_handler
    raise e.ag_error_metadata.to_exception(e)
ValueError: in user code:
 
    File "/databricks/conda/envs/model-10/lib/python3.8/site-packages/keras/engine/training.py", line 1801, in predict_function  *
        return step_function(self, iterator)
    File "/databricks/conda/envs/model-10/lib/python3.8/site-packages/keras/engine/training.py", line 1790, in step_function  **
        outputs = model.distribute_strategy.run(run_step, args=(data,))
    File "/databricks/conda/envs/model-10/lib/python3.8/site-packages/keras/engine/training.py", line 1783, in run_step  **
        outputs = model.predict_step(data)
    File "/databricks/conda/envs/model-10/lib/python3.8/site-packages/keras/engine/training.py", line 1751, in predict_step
        return self(x, training=False)
    File "/databricks/conda/envs/model-10/lib/python3.8/site-packages/keras/utils/traceback_utils.py", line 67, in error_handler
        raise e.with_traceback(filtered_tb) from None
    File "/databricks/conda/envs/model-10/lib/python3.8/site-packages/keras/layers/core/reshape.py", line 110, in _fix_unknown_dimension
        raise ValueError(msg)
 
 
ValueError: Exception encountered when calling layer "reshape" (type Reshape).
 
  total size of new array must be unchanged, input_shape = [1], output_shape = [28, 28, 1]
 
  Call arguments received:
 
   • inputs=tf.Tensor(shape=(None, 1), dtype=float32)

This seems to be because I'm passing the image data as an encoded byte string, not a numpy array. According to the TensorFlow documentation, this is how it has to be passed.
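The shape mismatch in the traceback can be checked by hand. The sketch below assumes the serving layer turned the single "b64" field into one column per row, which would explain `input_shape = [1]`; the helper name `can_reshape` is made up for illustration.

```python
import math

# Shapes copied from the ValueError above (batch dimension excluded).
input_shape = (1,)           # assumed: one column per row, the single "b64" field
target_shape = (28, 28, 1)   # what the Reshape layer has to produce


def can_reshape(src, dst):
    """A reshape is only valid when both shapes hold the same number of elements."""
    return math.prod(src) == math.prod(dst)


elements_needed = math.prod(target_shape)

# One value cannot fill all the slots of a 28x28x1 image.
print(can_reshape(input_shape, target_shape))

# A flat row with one value per pixel would reshape cleanly.
print(can_reshape((elements_needed,), target_shape))
```

So the model needs either one numeric value per pixel in each row, or a decoding step that expands the string into pixels before the Reshape layer sees it.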

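If I have an image with shape (28,28,1), called img, I convert it to the required format like this:

import base64

image_data = base64.b64encode(img)  # img must be bytes-like, e.g. a contiguous uint8 numpy array
payload = {"b64": image_data.decode("utf-8")}  # renamed from json to avoid shadowing the json module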

My question has two parts:

How do I adjust my model to handle the b64 encoded string and convert it back to a 28x28 image first?
What is the exact JSON format I need to send the image data to the REST endpoint?

Atanu · ‎03-15-2022

@Anthony Gibbons Maybe this GitHub issue will work for your use case: https://github.com/mlflow/mlflow/issues/1661

Kaniz · ‎02-07-2022

Hi @Anthony Gibbons ! My name is Kaniz, and I'm the technical moderator here. Great to meet you, and thanks for your question! Let's see if your peers in the community have an answer to your question first. Or else I will get back to you soon. Thanks.

Kaniz · ‎02-17-2022

Hi @Anthony Gibbons, try converting your input to base64:

import base64
 
to_predict = test_images[0]
 
inputs = base64.b64encode(to_predict)

Then convert it to a DataFrame and send the request.

On the backend, decode it back to the original array with:

np.frombuffer(base64.b64decode(encoded), np.uint8)
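As a rough illustration of the round trip described above, the sketch below builds a request body in the `[{"b64": ...}]` form from the question and decodes it again using only the standard library. The pixel values are placeholders, and whether the endpoint accepts this exact body is an assumption, not something this thread confirms.

```python
import base64
import json

# Stand-in for a 28x28x1 uint8 image: 784 raw bytes (placeholder pixel values).
raw_pixels = bytes(i % 256 for i in range(28 * 28 * 1))

# Client side: base64-encode the bytes and wrap them in the record list from the question.
encoded = base64.b64encode(raw_pixels).decode("ascii")
body = json.dumps([{"b64": encoded}])

# Receiving side: parse the JSON and decode the field back to raw bytes.
records = json.loads(body)
decoded = base64.b64decode(records[0]["b64"])

print(len(records), list(records[0].keys()), len(decoded))
```

Base64 carries only bytes, not shape or dtype, so the receiving side has to know that the buffer is uint8 and should be reshaped to (28, 28, 1).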

gibbona1 · ‎02-18-2022

Hi @Kaniz Fatma ,

Thanks for your answer!

My question is about this backend. You mean putting this line inside the predict() method?

When I'm defining a sequential model in TensorFlow, how do I incorporate what I want it to do to the input from a request?
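One option for where the decoding could live is a custom wrapper that decodes the input and then calls the Keras model. MLflow's `mlflow.pyfunc.PythonModel` exposes a `predict(self, context, model_input)` method for this kind of wrapper. The sketch below is not taken from the thread: `Base64ImageWrapper`, `decode_images`, and `_StubModel` are hypothetical names, the class is left as a plain Python class so it runs without MLflow or TensorFlow, and it assumes the input arrives as an iterable of base64 strings holding uint8 pixels.

```python
import base64

import numpy as np

IMAGE_SHAPE = (28, 28, 1)  # assumed to match the model's Reshape layer


def decode_images(b64_strings, dtype=np.uint8):
    """Turn base64 strings into a float32 batch of shape (n, 28, 28, 1)."""
    images = [
        np.frombuffer(base64.b64decode(s), dtype=dtype).reshape(IMAGE_SHAPE)
        for s in b64_strings
    ]
    return np.stack(images).astype(np.float32)


class Base64ImageWrapper:
    """Hypothetical wrapper; with MLflow it would subclass mlflow.pyfunc.PythonModel."""

    def __init__(self, keras_model):
        self.keras_model = keras_model

    def predict(self, context, model_input):
        # model_input is assumed to be an iterable of base64 strings, for example
        # the "b64" column of the DataFrame built from the request.
        return self.keras_model.predict(decode_images(model_input))


class _StubModel:
    """Stand-in for the trained Keras model so the sketch runs without TensorFlow."""

    def predict(self, batch):
        return np.zeros((batch.shape[0], 10), dtype=np.float32)


# Usage: encode one placeholder image, then run it through the wrapper.
img = (np.arange(28 * 28) % 256).astype(np.uint8).reshape(IMAGE_SHAPE)
encoded = base64.b64encode(img.tobytes()).decode("ascii")

batch = decode_images([encoded])
preds = Base64ImageWrapper(_StubModel()).predict(None, [encoded])
print(batch.shape, preds.shape)
```

If the model was trained on scaled pixels (for example, divided by 255), the same scaling would need to be applied inside `decode_images` before prediction.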

Kaniz · ‎02-18-2022

Hi @Anthony Gibbons , This link might help you as well.

Correct setup and format for calling REST API for image classification
