
EasyOcr Endpoint not accepting inputs


Hi all! I am trying to create an endpoint for EasyOCR. I was able to create the experiment using a wrapper class with the code below:



# import libraries

import mlflow
import mlflow.pyfunc
import cloudpickle
import cv2
import re
import easyocr
import base64

import os
import requests
import numpy as np
import pandas as pd
import json
from PIL import Image

# load reader
class EasyOCRWrapper(mlflow.pyfunc.PythonModel):
    def load_context(self, context):
        # Load your EasyOCR model during the deployment context
        self.reader = easyocr.Reader(['en'])
    def predict(self, context, model_input):
        # Make predictions using the loaded EasyOCR model
        return self.reader.readtext(model_input)

# set experiment name
experiment_name = "easyocr_test"

# get images with text
img = "/dbfs/FileStore/path_to_png_image1"
img2 = "/dbfs/FileStore/path_to_png_image2"

# set image examples
example_image_path = img
example_image_description = "Example image for OCR."

# image preprocessing function
def preprocessing(image_path):
    """
    Preprocesses an image in preparation for OCR detection.

    image_path = path to image
    clean_image = preprocessed image
    """
    raw_image = cv2.imread(image_path)
    clean_image = cv2.cvtColor(raw_image, cv2.COLOR_BGR2GRAY)
    return clean_image

# create input example
input_example = preprocessing(img)

# start mlflow run
with mlflow.start_run(run_name="easyocr_run_v2"):

    mlflow.log_artifact(example_image_path, artifact_path="input_images")

    # Log the input image description as a parameter
    mlflow.log_param("input_image_description", example_image_description)

    # Log parameters
    mlflow.log_param("model", "EasyOCR")

    # Log the EasyOCR model wrapper as an artifact
    mlflow.pyfunc.log_model("easyocr_model", python_model=EasyOCRWrapper(), input_example=input_example)

mlflow_ui_url = mlflow.get_tracking_uri()
print(f"MLflow UI: {mlflow_ui_url}")


From here I was able to:

  1. Test the loaded model and get an output
  2. Register the model
  3. Create the endpoint with the model

Of note, the experiment for this run did not create an input schema from this input example. When I tried to pass the image to the model endpoint with the following code, I got the error shown after it:


# initiate functions to query endpoint
def create_tf_serving_json2(data):
  return {'inputs': {name: data[name].tolist() for name in data.keys()} if isinstance(data, dict) else data.tolist()}

def score_model2(dataset):
  url = ''
  headers = {'Authorization': f'Bearer {dbutils.secrets.get("SecretBucket", "DatabricksToken")}', 'Content-Type': 'application/json'}
  ds_dict = {'dataframe_split': dataset.to_dict(orient='split')} if isinstance(dataset, pd.DataFrame) else create_tf_serving_json2(dataset)
  data_json = json.dumps(ds_dict, allow_nan=True)
  response = requests.request(method='POST', headers=headers, url=url, data=data_json)
  if response.status_code != 200:
    raise Exception(f'Request failed with status {response.status_code}, {response.text}')
  return response.json()

# test with image
Exception: Request failed with status 400, {"error_code": "BAD_REQUEST", "message": "Encountered an unexpected error while evaluating the model. Verify that the input is compatible with the model for inference. Error 'OpenCV(4.8.1) /io/opencv/modules/imgproc/src/color.simd_helpers.hpp:94: error: (-2:Unspecified error) in function 'cv::impl::{anonymous}::CvtHelper<VScn, VDcn, VDepth, sizePolicy>::CvtHelper(cv::InputArray, cv::OutputArray, int) [with VScn = cv::impl::{anonymous}::Set<1>; VDcn = cv::impl::{anonymous}::Set<3, 4>; VDepth = cv::impl::{anonymous}::Set<0, 2, 5>; cv::impl::{anonymous}::SizePolicy sizePolicy = cv::impl::<unnamed>::NONE; cv::InputArray = const cv::_InputArray&; cv::OutputArray = const cv::_OutputArray&]'\n> Unsupported depth of input image:\n>     'VDepth::contains(depth)'\n> where\n>     'depth' is 4 (CV_32S)\n'", "stack_trace": "Traceback (most recent call last):\n  File \"/opt/conda/envs/mlflow-env/lib/python3.9/site-packages/src/mlflowserving/scoring_server/\", line 457, in transformation\n    raw_predictions = model.predict(data, params=params)\n  File \"/opt/conda/envs/mlflow-env/lib/python3.9/site-packages/mlflow/pyfunc/\", line 491, in predict\n    return _predict()\n  File \"/opt/conda/envs/mlflow-env/lib/python3.9/site-packages/mlflow/pyfunc/\", line 477, in _predict\n    return self._predict_fn(data, params=params)\n  File \"/opt/conda/envs/mlflow-env/lib/python3.9/site-packages/mlflow/pyfunc/\", line 473, in predict\n    return self.python_model.predict(self.context, self._convert_input(model_input))\n  File \"<command-86792240677018>\", line 9, in predict\n  File \"/opt/conda/envs/mlflow-env/lib/python3.9/site-packages/easyocr/\", line 454, in readtext\n    img, img_cv_grey = reformat_input(image)\n  File \"/opt/conda/envs/mlflow-env/lib/python3.9/site-packages/easyocr/\", line 751, in reformat_input\n    img = cv2.cvtColor(image, cv2.COLOR_GRAY2BGR)\ncv2.error: OpenCV(4.8.1) 
/io/opencv/modules/imgproc/src/color.simd_helpers.hpp:94: error: (-2:Unspecified error) in function 'cv::impl::{anonymous}::CvtHelper<VScn, VDcn, VDepth, sizePolicy>::CvtHelper(cv::InputArray, cv::OutputArray, int) [with VScn = cv::impl::{anonymous}::Set<1>; VDcn = cv::impl::{anonymous}::Set<3, 4>; VDepth = cv::impl::{anonymous}::Set<0, 2, 5>; cv::impl::{anonymous}::SizePolicy sizePolicy = cv::impl::<unnamed>::NONE; cv::InputArray = const cv::_InputArray&; cv::OutputArray = const cv::_OutputArray&]'\n> Unsupported depth of input image:\n>     'VDepth::contains(depth)'\n> where\n>     'depth' is 4 (CV_32S)\n\n"}
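
The key detail in the trace is `'depth' is 4 (CV_32S)`: the array that reaches `cv2.cvtColor` inside `readtext` holds 32-bit integers, whereas the grayscale image returned by `preprocessing` is `uint8`. A JSON request body carries no dtype, so the array is presumably rebuilt on the serving side with a default integer type. Below is a minimal sketch of that round trip and one possible cast back, assuming the pixel values still fit in 0-255. `restore_uint8` is a hypothetical helper, not part of MLflow or EasyOCR.

```python
import json
import numpy as np

def restore_uint8(model_input):
    """Hypothetical helper: cast a deserialized pixel array back to uint8."""
    arr = np.asarray(model_input)
    if arr.dtype != np.uint8:
        # clip guards against values outside the valid 8-bit range
        arr = np.clip(arr, 0, 255).astype(np.uint8)
    return arr

# Stand-in for a preprocessed grayscale image (uint8, 2-D).
original = np.array([[0, 128], [255, 64]], dtype=np.uint8)

# tolist() + json.dumps keeps the values but drops the dtype.
payload = json.dumps({"inputs": original.tolist()})

# Rebuilding from JSON yields a default integer type, not uint8.
received = np.array(json.loads(payload)["inputs"])
restored = restore_uint8(received)
```

Inside the wrapper, `predict` could then call `self.reader.readtext(restore_uint8(model_input))`. Whether the served input arrives as an array or a DataFrame depends on the request format, so that part is an assumption to verify against the endpoint.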


From here I have tried:

1. Changing the data type of the inputs and for array inputs reshaped the dimensions

2. Changing the wrapper class to include imports

3. Adding the libraries EasyOCR needs directly as I do the experiment run.

Is it possible I am not creating the wrapper function correctly? Please advise on how I can get this endpoint to accept these preprocessed or raw image inputs. Thank you in advance.
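
For raw images, one option that avoids dtype and shape questions is to send the encoded file bytes as a base64 string and decode them inside `predict`. The sketch below shows only the encode/decode round trip, using the standard library. The column name `image_b64` and both helper names are hypothetical, and it assumes `readtext` accepts raw image bytes, which should be checked against the installed EasyOCR version.

```python
import base64
import json

def build_payload(image_bytes):
    """Hypothetical helper: wrap raw image bytes in a dataframe_split body."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return json.dumps({
        "dataframe_split": {"columns": ["image_b64"], "data": [[encoded]]}
    })

def extract_image_bytes(body):
    """Hypothetical helper: recover the bytes from the first row."""
    split = json.loads(body)["dataframe_split"]
    column = split["columns"].index("image_b64")
    return base64.b64decode(split["data"][0][column])

# Stand-in bytes; in practice these would come from open(image_path, "rb").read()
sample = b"\x89PNG\r\n\x1a\n-not-a-real-image-"
body = build_payload(sample)
recovered = extract_image_bytes(body)
```

On the serving side, `predict` would read the `image_b64` column from the incoming DataFrame, decode it the same way, and pass the result to `self.reader.readtext`.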
