
Why does the client need to have git installed for auto-logging to an MLflow server running in `--serve-artifacts` mode?


I have an MLflow server running with `--serve-artifacts` and with Postgres as the `--backend-store-uri`. The server runs in a container (base image python:3.9-bullseye) that has git installed and available on the PATH.

I am logging from Jupyter notebooks, which also run in containers (base image python:3.9-slim-bullseye) and do not have git installed.

When I try to auto-log from the client like this:

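import mlflow
import numpy as np
from sklearn.linear_model import LinearRegression
# enable autologging (assumed; the call is not shown in the post, but the log below reports an autologging run)
mlflow.autolog()
# prepare training data
X = np.array([[1, 1], [1, 2], [2, 2], [2, 3]])
y = np.dot(X, np.array([1, 2])) + 3
# train a model
model = LinearRegression().fit(X, y)
run_id = mlflow.last_active_run().info.run_id
print("Logged data and model in run {}".format(run_id))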

I get a warning that git is not installed, along with some other warnings and errors:

2022/05/01 14:21:41 WARNING mlflow.tracking.context.git_context: Failed to import Git (the Git executable is probably not on your PATH), so Git SHA is not available. Error: Failed to initialize: Bad git executable.
The git executable must be specified in one of the following ways:
    - be included in your $PATH
    - explicitly set via git.refresh()
All git commands will error until this is rectified.
This initial warning can be silenced or aggravated in the future by setting the
$GIT_PYTHON_REFRESH environment variable. Use one of the following values:
    - quiet|q|silence|s|none|n|0: for no warning or exception
    - warn|w|warning|1: for a printed warning
    - error|e|raise|r|2: for a raised exception
    export GIT_PYTHON_REFRESH=quiet
2022/05/01 14:21:41 INFO mlflow.utils.autologging_utils: Created MLflow autologging run with ID 'e914209e05d449e6af817d0d692b1012', which will track hyperparameters, performance metrics, model artifacts, and lineage information for the current sklearn workflow
2022/05/01 14:22:45 WARNING mlflow.utils.autologging_utils: Encountered unexpected error during sklearn autologging: API request to http://host.docker.internal:5000/api/2.0/mlflow-artifacts/artifacts/1/e914209e05d449e6af817d0d692b10... failed with exception HTTPConnectionPool(host='host.docker.internal', port=5000): Max retries exceeded with url: /api/2.0/mlflow-artifacts/artifacts/1/e914209e05d449e6af817d0d692b1012/artifacts/model/model.pkl (Caused by ResponseError('too many 500 error responses'))
Logged data and model in run e914209e05d449e6af817d0d692b1012

I couldn't figure out why clients need to have git installed. I have been under the assumption that clients only need to be able to send HTTP requests to the server and don't need anything else installed. What am I missing, and how can I avoid that warning, not by hiding it, but by actually fixing what causes it?
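
For reference, the warning text in the log lists three ways to deal with a missing git executable: put it on `$PATH`, call `git.refresh()`, or set `$GIT_PYTHON_REFRESH`. The sketch below is only an illustration of the first and third options from the client side, not a confirmed fix for this thread. The helper names `git_on_path` and `quiet_gitpython_if_git_missing` are hypothetical, and the sketch assumes it runs before `mlflow` is imported. It does not address the separate 500 error on the artifact upload shown in the log.

```python
import os
import shutil


def git_on_path() -> bool:
    """Return True if a `git` executable can be found on PATH."""
    return shutil.which("git") is not None


def quiet_gitpython_if_git_missing(env=os.environ) -> bool:
    """If git is absent, set GIT_PYTHON_REFRESH=quiet (one of the values
    listed in the warning) unless it is already set.

    Returns True when git was missing, False when git is available.
    """
    if git_on_path():
        return False
    env.setdefault("GIT_PYTHON_REFRESH", "quiet")
    return True


if __name__ == "__main__":
    # Run this before importing mlflow so the variable is already in place.
    git_missing = quiet_gitpython_if_git_missing()
    print("git missing on this client:", git_missing)
    print("GIT_PYTHON_REFRESH:", os.environ.get("GIT_PYTHON_REFRESH"))
```

Setting the variable only changes how the missing executable is reported. According to the warning itself, the consequence of git being absent is that the Git SHA is not available for the run. Installing git in the client image is the way to make `git_on_path()` return True.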


Accepted Solutions

Hi @Naveen Marthala, this is indeed an MLflow project, and it necessarily requires git.


Esteemed Contributor III

When it is part of the MLflow Project, it requires git.

@Hubert Dudek, I still haven't made anything a project, in the MLflow sense. So, would I need MLflow irrespective of what I am trying to do?

Community Manager

Hi @Naveen Marthala, just a friendly follow-up. Do you still need help, or did the above responses help you find the solution? Please let us know.
