Error accessing a file from DBFS inside an MLflow serving endpoint
01-07-2024 05:29 AM
Hi,
I have an MLflow model served on serverless GPU. It takes an audio file name as input, and the file is then passed as a parameter to a Hugging Face model inside the predict method. But I am getting the following error:
HFValidationError(\nhuggingface_hub.utils._validators.HFValidationError: Repo id must be in the form 'repo_name' or 'namespace/repo_name': '/dbfs/tmp'. Use `repo_type` argument if needed.\n"}
Appreciate any help.
Regards,
Sanjay
01-08-2024 10:56 PM
Thank you @Retired_mod for the prompt response. I am able to load the model from Hugging Face correctly. The issue is with loading the input audio file, which is stored in a directory on Databricks DBFS. The error occurs only after creating the serverless endpoint. Without the serverless endpoint, I can load the model from the registry, read the incoming audio file from the same location, and process it.
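One way to confirm that the file itself is the problem is to check the path inside `predict` before handing it to the Hugging Face code. Hugging Face loaders generally treat a string that is not an existing local path as a Hub repo id, which would likely explain the `HFValidationError` for `'/dbfs/tmp'`. The sketch below is only an assumption about how such a check could look; `require_local_file` is a hypothetical helper, not part of MLflow or Hugging Face.

```python
import os


def require_local_file(path: str) -> str:
    """Return `path` unchanged if it is an existing file, else raise.

    Hypothetical helper: call it at the top of `predict`, before the path
    is passed on to the Hugging Face model.
    """
    if not os.path.isfile(path):
        parent = os.path.dirname(path) or "."
        # Report what the serving container can actually see, so the error
        # says more than "invalid repo id".
        if os.path.isdir(parent):
            detail = f"contains {sorted(os.listdir(parent))[:10]}"
        else:
            detail = "does not exist"
        raise FileNotFoundError(
            f"{path!r} is not visible inside the serving container; "
            f"parent directory {parent!r} {detail}"
        )
    return path
```

If this raises inside the endpoint but not in a notebook, the DBFS mount is probably not available in the serving container, and the file has to reach the endpoint some other way.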
01-17-2025 09:37 AM
I have the same issue.
I have a large file that I cannot access from an MLflow service.
Things I have tried (none of these work):
- Read-only from DBFS
  - `dbfs:/myfolder/myfile.chroma` does not work
  - `/dbfs/myfolder/myfile.chroma` does not work
- Read-only from Unity Catalog Volume
  - `/Volumes/mycatalog/mydb/myfolder/myfile.chroma` does not work
- Read-only from S3 storage
  - `s3://mybucket/mydb/myfolder/myfile.chroma` does not work
So far, the only thing that works is packaging the huge file as an MLflow artifact and accessing it locally in the service (awful).
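For anyone who needs that workaround, here is a minimal sketch of the artifact approach, assuming a custom `mlflow.pyfunc.PythonModel`. The class name `FileBackedModel`, the artifact key `big_file`, and the placeholder `predict` logic are all hypothetical. The import fallback is there only so the class can be exercised without MLflow installed.

```python
import os

try:
    from mlflow.pyfunc import PythonModel
except ImportError:  # allows trying the sketch where MLflow is not installed
    PythonModel = object


class FileBackedModel(PythonModel):
    """Hypothetical model that reads a large file shipped as an artifact."""

    ARTIFACT_KEY = "big_file"  # assumed key, chosen when the model is logged

    def load_context(self, context):
        # MLflow packages each logged artifact with the model and exposes
        # its local path through context.artifacts.
        self.file_path = context.artifacts[self.ARTIFACT_KEY]

    def predict(self, context, model_input, params=None):
        # Placeholder logic: report where the packaged file lives and its size.
        return {
            "path": self.file_path,
            "bytes": os.path.getsize(self.file_path),
        }
```

The file is attached at logging time by passing `artifacts={"big_file": "<path readable from the notebook>"}` together with `python_model=FileBackedModel()` to `mlflow.pyfunc.log_model`. The endpoint then reads its own local copy instead of DBFS, a Volume, or S3, at the cost of a much larger model package.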
The errors are all similar to:
`ValueError: Dataset at path Volumes/non_prod/metrics/files/recommender/embed-db.chroma was not found`
Notice that the leading `/` is stripped from `/Volumes/` in the reported path, and the same happens with the `s3://` protocol prefix, etc.
I can't use MLflow endpoints without this very basic functionality.

