Logging model to MLflow using Feature Store API. Getting TypeError: join() argument must be str, bytes, or os.PathLike object, not 'dict'

zachclem
New Contributor III

I'm using Databricks and trying to log a model to MLflow with the Feature Store log_model function, but I get this error: TypeError: join() argument must be str, bytes, or os.PathLike object, not 'dict'. I'm using the Databricks Runtime for ML, version 10.4 LTS ML (includes Apache Spark 3.2.1, Scala 2.12).

fs.log_model(
    model,
    artifact_path="fs_model",
    flavor=mlflow.sklearn,
    training_set=training_set,
)

And here are the error logs.

TypeError: join() argument must be str, bytes, or os.PathLike object, not 'dict'
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
     62 if __name__ == "__main__":
     63     job = ModelTrainJob()
---> 64     job.launch()
 
/tmp/tmp51ge7k75.py in launch(self)
     56             env_vars=self.env_vars,
     57         )
---> 58         ModelTrain(cfg).run()
     59         _logger.info("ModelTrainJob job finished!")
     60 
 
/databricks/python/lib/python3.8/site-packages/customer_churn/objects/model_train.py in run(self)
    215             # Log model using Feature Store API
    216             _logger.info("Logging model to MLflow using Feature Store API")
--> 217             fs.log_model(
    218                 model,
    219                 artifact_path="fs_model",
 
/databricks/.python_edge_libs/databricks/feature_store/client.py in log_model(self, model, artifact_path, flavor, training_set, registered_model_name, await_registration_for, **kwargs)
   2106             # the databricks-feature-store package is not available via conda or pip.
   2107             conda_file = raw_mlflow_model.flavors["python_function"][mlflow.pyfunc.ENV]
-> 2108             conda_env = read_yaml(raw_model_path, conda_file)
   2109 
   2110             # Get the pip package string for the databricks-feature-lookup client
 
/databricks/python/lib/python3.8/site-packages/mlflow/utils/file_utils.py in read_yaml(root, file_name)
    210         )
    211 
--> 212     file_path = os.path.join(root, file_name)
    213     if not exists(file_path):
    214         raise MissingConfigException("Yaml file '%s' does not exist." % file_path)
 
/usr/lib/python3.8/posixpath.py in join(a, *p)
     88                 path += sep + b
     89     except (TypeError, AttributeError, BytesWarning):
---> 90         genericpath._check_arg_types('join', a, *p)
     91         raise
     92     return path
 
/usr/lib/python3.8/genericpath.py in _check_arg_types(funcname, *args)
    150             hasbytes = True
    151         else:
--> 152             raise TypeError(f'{funcname}() argument must be str, bytes, or '
    153                             f'os.PathLike object, not {s.__class__.__name__!r}') from None
    154     if hasstr and hasbytes:
 
TypeError: join() argument must be str, bytes, or os.PathLike object, not 'dict'

1 ACCEPTED SOLUTION

Accepted Solutions

zachclem
New Contributor III

I updated my Databricks Runtime from 10.4 to 12.1, and this solved the issue.
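Since the fix here was a runtime upgrade, it may help to record which package versions are installed before and after the change. The snippet below is a minimal sketch using only the standard library; `installed_version` is a hypothetical helper name, and the package names in the loop are assumptions about how the distributions are named on your cluster. The traceback above notes that the Feature Store client is not installed via conda or pip on the runtime, so it may well be reported as not found.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version string of a distribution, or None if it is not found."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


# Package names below are assumptions; adjust them to what is installed on your cluster.
for name in ("mlflow", "databricks-feature-store"):
    print(name, installed_version(name))
```

Running this in a notebook cell on both runtimes gives a quick way to compare the MLflow versions involved.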

2 REPLIES

Anonymous
Not applicable

@ZacharyHuh:

The error message indicates that os.path.join() expected a string, bytes, or os.PathLike object but received a dictionary instead. It looks like the error occurs when MLflow attempts to read a YAML file associated with the model: the read_yaml function in the mlflow.utils.file_utils module raises the error because it expects a path string but receives a dictionary object.
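The failure can be reproduced outside Databricks with the standard library alone. This is a minimal sketch: `join_error_message` is a hypothetical helper, and the path and dictionary contents are made-up placeholders, not values taken from the failing job.

```python
import os


def join_error_message(root, file_name):
    """Return the TypeError message raised by os.path.join, or None if the join succeeds."""
    try:
        os.path.join(root, file_name)
    except TypeError as exc:
        return str(exc)
    return None


# A plain file name joins without error.
ok = join_error_message("/tmp/model", "conda.yaml")

# A dict in the file-name position triggers the same kind of TypeError as in the traceback.
bad = join_error_message("/tmp/model", {"conda": "conda.yaml"})

print(ok)
print(bad)
```

The second call fails because os.path.join accepts only path-like arguments, which is the same check that read_yaml hits in the traceback.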

To fix this error, you may want to check the model object that you are passing to fs.log_model. It's possible that a dictionary in this object is causing the issue. You may need to modify the model object so that it contains only strings, bytes, or os.PathLike objects.
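Note that in the traceback, the value passed to read_yaml as file_name is conda_file, which is read from raw_mlflow_model.flavors["python_function"][mlflow.pyfunc.ENV], so the dictionary may come from the saved model's flavor configuration rather than from the model object itself. The sketch below shows one way code could tolerate both shapes of that entry. `resolve_conda_file` is a hypothetical helper, not part of MLflow or the Feature Store client, and the assumption that the dictionary form keeps the conda file under a "conda" key should be checked against your MLflow version.

```python
def resolve_conda_file(env_entry):
    """Hypothetical helper: return the conda YAML file name from a flavor env entry.

    Accepts either a plain file name (str) or a dict that is assumed
    to store the conda file name under the "conda" key.
    """
    if isinstance(env_entry, str):
        return env_entry
    if isinstance(env_entry, dict) and isinstance(env_entry.get("conda"), str):
        return env_entry["conda"]
    raise TypeError(
        f"Unsupported env entry of type {type(env_entry).__name__!r}: {env_entry!r}"
    )


# Both shapes resolve to a string that os.path.join can accept.
print(resolve_conda_file("conda.yaml"))
print(resolve_conda_file({"conda": "conda.yaml", "virtualenv": "python_env.yaml"}))
```

This is only an illustration of the mismatch; since the failing code lives inside the runtime's Feature Store client, using a runtime whose client and MLflow versions agree, as in the accepted solution, is the practical fix.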

Hope this helps! Please let us know otherwise.
