
Error accessing file from DBFS inside MLflow serving endpoint

sanjay
Valued Contributor II

Hi,

I have an MLflow model served on a serverless GPU endpoint. It takes an audio file name as input, and that file is then passed as a parameter to a Hugging Face model inside the predict method. But I am getting the following error:

huggingface_hub.utils._validators.HFValidationError: Repo id must be in the form 'repo_name' or 'namespace/repo_name': '/dbfs/tmp'. Use `repo_type` argument if needed.
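For reference, this is roughly the shape of the setup (not the actual code): a pyfunc wrapper whose predict method receives a DBFS-style path and hands it to a Hugging Face pipeline. The model id, task, and column name below are placeholders.

```python
import mlflow.pyfunc
from transformers import pipeline


class AudioTranscriber(mlflow.pyfunc.PythonModel):
    def load_context(self, context):
        # Placeholder model id; the real model comes from the registry.
        self.asr = pipeline("automatic-speech-recognition",
                            model="openai/whisper-tiny")

    def predict(self, context, model_input):
        # Assumes the request carries a path such as "/dbfs/tmp/sample.wav"
        # in a column named "audio_path" (placeholder name).
        audio_path = model_input["audio_path"].iloc[0]
        # If that path is not readable inside the serving container, or if a
        # directory like "/dbfs/tmp" ends up where transformers expects a
        # model id, huggingface_hub raises the HFValidationError shown above.
        return self.asr(audio_path)
```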

Appreciate any help.

Regards,

Sanjay

 


3 REPLIES

Kaniz_Fatma
Community Manager

Hi @sanjay, the HFValidationError you're encountering is typically thrown when the Hugging Face model loading function (from_pretrained) cannot find the model you're trying to load.

 

This can happen in two scenarios:

  1. If you're passing a nonexistent path: If the path is in the form 'repo_name' or 'namespace/repo_name', ModelHubMixin.from_pretrained will throw a FileNotFoundError. Otherwise, it will throw an HFValidationError because the path does not exist locally and is not in …
  2. If you pass an existing local folder that does not contain the model files, SentenceTransformer will throw an OSError.

In your case, your path ('/dbfs/tmp') is not being recognized as a valid repository ID. The repository ID should be 'repo_name' or 'namespace/repo_name'. If '/dbfs/tmp' is a local directory where your model is stored, you might want to ensure that the directory exists and contains the necessary model files.

 

If you're trying to load a model from a local directory, make sure to provide the absolute path to that directory.
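For illustration, the two call patterns look roughly like this; the hub id and local path below are placeholders, not the actual model from this thread:

```python
from transformers import AutoModel, AutoTokenizer

# Hub repo id: must look like 'repo_name' or 'namespace/repo_name'.
model = AutoModel.from_pretrained("bert-base-uncased")

# Local directory: pass an absolute path that exists and contains the saved
# model files (config.json, weights, tokenizer files). Placeholder path:
local_model = AutoModel.from_pretrained("/dbfs/models/my-model")
tokenizer = AutoTokenizer.from_pretrained("/dbfs/models/my-model")
```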

 

I hope this helps! Let me know if you have any other questions.

sanjay
Valued Contributor II

Thank you @Kaniz_Fatma for the prompt response. I am able to load the model from Hugging Face correctly. The issue is with loading the input audio file, which is stored in a local directory in Databricks DBFS. This error only appears after creating the serverless endpoint. Without the serverless endpoint, I am able to load the model from the registry, read the incoming audio file from the same location, and process it.
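For reference, the read that works on the cluster looks roughly like this, since DBFS is exposed as a FUSE mount at /dbfs on classic compute (the path is a placeholder):

```python
# On an interactive cluster, a DBFS path can be opened with ordinary file APIs.
with open("/dbfs/tmp/sample.wav", "rb") as f:  # placeholder path
    audio_bytes = f.read()
```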

Kaniz_Fatma
Community Manager
Accepted Solution

Hi @sanjay, I see. It seems like the issue is with accessing the local file from a serverless endpoint. This could be because serverless functions often have different access permissions and …

 

When you create a serverless endpoint, the function is executed in a different environment where the local filesystem might not be directly accessible. This is especially true if the function is running in a containerized environment or on a different ....

 

In the case of Databricks and DBFS (Databricks File System), files that are stored in DBFS can be accessed like they are in the local file system. However, the access path might be different when you're running in a serverless context.

 

Here are a few things you could try:

 

Check the file path: Make sure that the file path you're providing is accessible from the serverless function. In Databricks, you can access DBFS paths directly as if they were local paths, but the exact path wo…

Check the permissions: Ensure that the serverless function has the necessary permissions to access the file. This could mean setting the appropriate IAM roles in AWS, or the equivalent in other cloud providers, that allow access to the necessary resources.

Use absolute paths: If you're using relative paths to access the file, try using absolute paths instead. This can often help resolve issues where the file is not found.

Copy the file to a location accessible by the function: If the serverless function cannot access the local filesystem, you might need to copy the file to a location that the function can access. This could be a cloud storage bucket like S3 on AWS or Blob Storage on Azure (see the sketch after this list).
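As a concrete, purely hypothetical illustration of the first and last suggestions, the endpoint code could verify the incoming path and fall back to object storage it is allowed to reach. The helper name resolve_audio_path, the bucket name, and the key layout below are all placeholders, not a confirmed fix:

```python
import os
import tempfile

import boto3  # assumes AWS; bucket name and credentials are placeholders


def resolve_audio_path(audio_path: str) -> str:
    """Return a path that is readable inside the serving container."""
    # Case 1: the DBFS-style path (e.g. "/dbfs/tmp/sample.wav") happens to be
    # visible inside the endpoint; use it directly.
    if os.path.exists(audio_path):
        return audio_path

    # Case 2: fall back to a cloud storage bucket the endpoint can reach,
    # downloading the file to local temp storage first.
    s3 = boto3.client("s3")
    local_path = os.path.join(tempfile.gettempdir(), os.path.basename(audio_path))
    s3.download_file("my-audio-bucket", audio_path.lstrip("/"), local_path)
    return local_path
```

Calling something like this at the top of predict makes it obvious whether the failure is a missing file or a permissions issue, instead of surfacing later as a repo-id validation error.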

 

If none of these suggestions help, could you provide more details about your setup? Specifically, information about how you're running the serverless function, how you're attempting to access the file in your code, and the exact error message you're seeing would be helpful. This will enable me to provide more targeted assistance.
