Machine Learning
Dive into the world of machine learning on the Databricks platform. Explore discussions on algorithms, model training, deployment, and more. Connect with ML enthusiasts and experts.

An error occurred while loading the model. Failed to load the pickled function from a hexadecimal string

marcelo2108
Contributor

[8586fsbgpb] An error occurred while loading the model. Failed to load the pickled function from a hexadecimal string. Error: Can't get attribute 'transform_input' on <module '__main__' from '/opt/conda/envs/mlflow-env/bin/gunicorn'>.

I'm using the following functions to transform the model's input and output:

def transform_input(**request):
    print("Type of prompt:", type(request["prompt"]))
    request["messages"] = [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": request["prompt"]},
    ]
    request["stop"] = ["\n\n"]
    print("Request format:", request)
    return request

def transform_output(response):
    return response["candidates"][0]

# If using a serving endpoint, the model serving endpoint is created in `02_[chat]_mlflow_logging_inference`
llm = Databricks(
    endpoint_name="llama2-7b-chat-completion",
    transform_input_fn=transform_input,
    transform_output_fn=transform_output,
    extra_params={"temperature": 0.01, "max_tokens": 300},
)
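For readers following along: the two hooks wrap the endpoint call roughly as in the sketch below. `invoke_with_transforms` and `fake_serving_call` are illustrative stand-ins, not the actual Databricks wrapper, so only the input-then-call-then-output ordering is meant to carry over.

```python
# Hedged sketch, not the actual Databricks wrapper: transform_input_fn runs
# before the HTTP request to the endpoint, transform_output_fn runs on the
# JSON response that comes back.
def invoke_with_transforms(transform_input_fn, transform_output_fn, serving_call, **params):
    request = transform_input_fn(**params)   # e.g. adds "messages" and "stop"
    response = serving_call(request)         # stands in for the endpoint POST
    return transform_output_fn(response)     # e.g. picks candidates[0]

def fake_serving_call(request):
    # Illustrative stand-in for the model serving endpoint.
    return {"candidates": [{"role": "assistant",
                            "content": "Echo: " + request["messages"][-1]["content"]}]}

result = invoke_with_transforms(
    lambda **req: {"messages": [{"role": "user", "content": req["prompt"]}]},
    lambda resp: resp["candidates"][0],
    fake_serving_call,
    prompt="hi",
)
```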


Is there anything I'm missing that would avoid this error?
1 ACCEPTED SOLUTION


marcelo2108
Contributor

The solution I found was to move those functions into a separate Python file, e.g. custom_functions.py, and log the model in MLflow as follows:

with mlflow.start_run() as run:
    signature = infer_signature(question, answer)
    logged_model = mlflow.langchain.log_model(
        chain,
        artifact_path="chain",
        registered_model_name=registered_model_name,
        loader_fn=get_retriever,
        persist_dir=persist_directory,
        pip_requirements=[
            "mlflow==" + mlflow.__version__,
            "langchain==" + langchain.__version__,
            "sentence_transformers",
            "chromadb",
        ],
        code_paths=["custom_functions.py"],
        # conda_env=conda_env,
        input_example=question,
        metadata={"task": "llm/v1/chat"},
        signature=signature,
        await_registration_for=900,  # wait up to 15 minutes for model registration to complete
    )
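For context on why this works: with `code_paths=["custom_functions.py"]`, MLflow copies the file into the model artifact, so the serving container can import the transforms from a real module instead of looking them up on `__main__`. Below is a self-contained sketch of what the separate module might look like; the file is written inline here only so the example runs standalone, and in practice you would create `custom_functions.py` next to your notebook.

```python
import importlib
import pathlib
import sys

# Hypothetical contents of custom_functions.py, written inline only to keep
# this sketch self-contained.
pathlib.Path("custom_functions.py").write_text('''
def transform_input(**request):
    request["messages"] = [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": request["prompt"]},
    ]
    request["stop"] = ["\\n\\n"]
    return request

def transform_output(response):
    return response["candidates"][0]
''')

sys.path.insert(0, ".")        # make the current directory importable
importlib.invalidate_caches()
custom_functions = importlib.import_module("custom_functions")

# Because the functions now live in an importable module, pickling records
# "custom_functions.transform_input" instead of "__main__.transform_input".
req = custom_functions.transform_input(prompt="Hello")
```

The driver notebook would then do `from custom_functions import transform_input, transform_output` before building the chain, so the pickled references resolve inside the serving container.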


4 REPLIES

Kaniz_Fatma
Community Manager

Hi @marcelo2108

The error you're encountering occurs because your transform_input and transform_output functions are defined inside the script where you're trying to use them. When you use MLflow with Databricks, the functions must be defined at the top level of your script, not inside other functions or classes: MLflow needs to import them by name, and functions defined inside other functions or classes are not available for import.

To fix this issue, you should define your transform_input and transform_output functions at the top level of your script.

Here's an example of how you can do this:

def transform_input(**request):
    print("Type of prompt:", type(request["prompt"]))
    request["messages"] = [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": request["prompt"]},
    ]
    request["stop"] = ["\n\n"]
    print("Request format:", request)
    return request

def transform_output(response):
    return response["candidates"][0]

llm = Databricks(
    endpoint_name="llama2-7b-chat-completion",
    transform_input_fn=transform_input,
    transform_output_fn=transform_output,
    extra_params={"temperature": 0.01, "max_tokens": 300},
)

Now, your functions should be available for import by MLflow, and you should no longer encounter the error you were seeing.
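For what it's worth, the `Can't get attribute 'transform_input' on <module '__main__' ...>` part of the traceback can be illustrated with the standard `pickle` module (MLflow serializes these hooks with cloudpickle, but when a function is stored by reference, as the traceback here indicates, the lookup rule is the same): a module-level function is pickled as a reference, the defining module's name plus the function's qualified name, and unpickling re-does that attribute lookup. In the serving container, `__main__` is the gunicorn entry point, which has no `transform_input` attribute. A small illustration, assuming nothing beyond the standard library:

```python
import pickle

def transform_input(**request):
    return request

payload = pickle.dumps(transform_input)

# The pickle stream stores only a reference: the defining module's name plus
# the function's qualified name. Unpickling performs the equivalent of
# getattr(sys.modules[module_name], "transform_input") -- the exact lookup
# that fails when __main__ is gunicorn rather than your script.
restored = pickle.loads(payload)
```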

marcelo2108
Contributor

Hi @Kaniz_Fatma, I already put them at the top level of the cell script, exactly as you mentioned (see the attached file), but no luck. Should I put them at the top of the notebook? Any other clues?


marcelo2108
Contributor

However, I could not make progress in the end, because I ran into the error I reported in another thread:

[5bb99fzs2f] An error occurred while loading the model. You haven't configured the CLI yet! Please configure by entering `/opt/conda/envs/mlflow-env/bin/gunicorn configure`.
As described in:

Re: Problem when serving a langchain model on Data... - Databricks - 59506
