05-01-2022 07:28 AM
I have an MLflow server running with `--serve-artifacts` and with Postgres as the `--backend-store-uri`. The machine running the server (a container with base image python:3.9-bullseye) has git installed and available on the PATH.
I am logging from Jupyter notebooks, which also run in containers (with base image python:3.9-slim-bullseye) and don't have git installed.
When I try to autolog from the client like this:
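import mlflow
import numpy as np
from sklearn.linear_model import LinearRegression

# enable automatic logging of params, metrics and the fitted model
mlflow.sklearn.autolog()
# prepare training data
X = np.array([[1, 1], [1, 2], [2, 2], [2, 3]])
y = np.dot(X, np.array([1, 2])) + 3
# train a model
model = LinearRegression()
model.fit(X, y)
run_id = mlflow.last_active_run().info.run_id
print("Logged data and model in run {}".format(run_id))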
I get a warning that git is not installed, along with some more warnings and errors:
2022/05/01 14:21:41 WARNING mlflow.tracking.context.git_context: Failed to import Git (the Git executable is probably not on your PATH), so Git SHA is not available. Error: Failed to initialize: Bad git executable.
The git executable must be specified in one of the following ways:
- be included in your $PATH
- be set via $GIT_PYTHON_GIT_EXECUTABLE
- explicitly set via git.refresh()
All git commands will error until this is rectified.
This initial warning can be silenced or aggravated in the future by setting the
$GIT_PYTHON_REFRESH environment variable. Use one of the following values:
- quiet|q|silence|s|none|n|0: for no warning or exception
- warn|w|warning|1: for a printed warning
- error|e|raise|r|2: for a raised exception
Example:
export GIT_PYTHON_REFRESH=quiet
2022/05/01 14:21:41 INFO mlflow.utils.autologging_utils: Created MLflow autologging run with ID 'e914209e05d449e6af817d0d692b1012', which will track hyperparameters, performance metrics, model artifacts, and lineage information for the current sklearn workflow
2022/05/01 14:22:45 WARNING mlflow.utils.autologging_utils: Encountered unexpected error during sklearn autologging: API request to http://host.docker.internal:5000/api/2.0/mlflow-artifacts/artifacts/1/e914209e05d449e6af817d0d692b10... failed with exception HTTPConnectionPool(host='host.docker.internal', port=5000): Max retries exceeded with url: /api/2.0/mlflow-artifacts/artifacts/1/e914209e05d449e6af817d0d692b1012/artifacts/model/model.pkl (Caused by ResponseError('too many 500 error responses'))
Logged data and model in run e914209e05d449e6af817d0d692b1012
I couldn't figure out why clients need to have git installed. I have been under the assumption that clients only need to be able to send HTTP requests to the server and don't need anything else installed. What am I missing, and how can I avoid that warning, not by hiding it, but by actually fixing what's causing it?
05-12-2022 04:49 AM
Hi @Naveen Marthala , This is indeed an MLflow project, and it necessarily requires git.
05-01-2022 08:31 AM
When it is part of the MLflow Project, it requires git.
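For a client that only logs runs, the warning text in the question already lists the knobs GitPython reads: `$PATH`, `$GIT_PYTHON_GIT_EXECUTABLE` and `$GIT_PYTHON_REFRESH`. Going by the log, MLflow appears to want git on the client only to record the Git SHA of the source, so the choice is roughly between installing git in the notebook image and telling GitPython to stay quiet. Below is a minimal sketch of that decision, assuming it runs before `import mlflow`; the helper name `configure_git_for_mlflow` is hypothetical and not part of MLflow or GitPython.

```python
import os
import shutil


def configure_git_for_mlflow(env=None):
    """Hypothetical helper: set the GitPython env vars named in the warning.

    Call it before `import mlflow` so the variables are in place when
    GitPython is first imported.
    """
    env = os.environ if env is None else env
    # Look for a git executable on the PATH taken from `env`.
    git_path = shutil.which("git", path=env.get("PATH"))
    if git_path:
        # git exists: point GitPython at it explicitly.
        env["GIT_PYTHON_GIT_EXECUTABLE"] = git_path
        return "git-found"
    # git is absent: use the "quiet" value listed in the warning text.
    env["GIT_PYTHON_REFRESH"] = "quiet"
    return "git-missing"


if __name__ == "__main__":
    print(configure_git_for_mlflow())
```

Note that the "too many 500 error responses" line in the same log comes from the artifact upload request to the server, so it looks like a separate problem from the git warning and would probably remain even with git installed on the client.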
05-01-2022 09:17 AM
@Hubert Dudek , I still haven't made anything a project, in the context of MLflow. So, would I need git irrespective of what I am trying to do?
05-18-2022 05:56 AM
Hi @Naveen Marthala , Just a friendly follow-up. Do you still need help, or did the above responses help you find the solution? Please let us know.