<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Pushing SparkNLP Model on Mlflow in Machine Learning</title>
    <link>https://community.databricks.com/t5/machine-learning/pushing-sparknlp-model-on-mlflow/m-p/17818#M976</link>
    <description>&lt;P&gt;Website design tutorial: &lt;A href="https://arzgu.ir/blog/What%20is%20website%20design" target="test_blank"&gt;https://arzgu.ir/blog/What%20is%20website%20design&lt;/A&gt;&lt;/P&gt;</description>
    <pubDate>Sat, 04 Mar 2023 07:31:55 GMT</pubDate>
    <dc:creator>tala</dc:creator>
    <dc:date>2023-03-04T07:31:55Z</dc:date>
    <item>
      <title>Pushing SparkNLP Model on Mlflow</title>
      <link>https://community.databricks.com/t5/machine-learning/pushing-sparknlp-model-on-mlflow/m-p/17816#M974</link>
      <description>&lt;P&gt;Hello everyone,&lt;/P&gt;&lt;P&gt;I am trying to load a Spark NLP model (&lt;A href="https://nlp.johnsnowlabs.com/2020/12/05/detect_language_375_xx.html" alt="https://nlp.johnsnowlabs.com/2020/12/05/detect_language_375_xx.html" target="_blank"&gt;link&lt;/A&gt; for more details about the model, if required) from the MLflow Registry.&lt;/P&gt;&lt;P&gt;To this end, I followed a tutorial and implemented the code below:&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;import mlflow.pyfunc
import sparknlp  # Spark NLP must be installed so the saved pipeline stages can be loaded
from pyspark.ml import PipelineModel
from pyspark.sql import functions as F
&amp;nbsp;
class LangDetectionModel(mlflow.pyfunc.PythonModel):
    def __init__(self):
        super().__init__()
        # Embed the Spark NLP pipeline in the wrapper
        self._model = PipelineModel.load("/mnt/sparknlp_models/detect_language_375/")
&amp;nbsp;
    def predict(self, context, eval_data_lang_detect):
        # Apply the pipeline and keep the first detected language for each row
        list_columns = eval_data_lang_detect.columns
        model_output = self._model.transform(eval_data_lang_detect).select(
            list_columns + [F.col("language.result").getItem(0).alias("sparknlp_column")]
        )
        return model_output
&amp;nbsp;
model_path = "my-langdetect-model"
reg_model_name = "NlpieLangDetection"
sparknlp_model = LangDetectionModel()&lt;/CODE&gt;&lt;/PRE&gt;&lt;PRE&gt;&lt;CODE&gt;# Log MLflow entities and save the model
mlflow.set_tracking_uri("sqlite:///mlruns.db")
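&amp;nbsp;
import sys
import cloudpickle
&amp;nbsp;
# Version of the running Python interpreter, as major.minor.micro
PYTHON_VERSION = "{}.{}.{}".format(*sys.version_info[:3])
&amp;nbsp;
# Save the conda environment for this model.
conda_env = {
    'channels': ['defaults', 'conda-forge'],
    'dependencies': [
        'python={}'.format(PYTHON_VERSION),
        'pip',
        # pip packages are listed in a nested {'pip': [...]} entry
        {
            'pip': [
                'mlflow',
                'cloudpickle=={}'.format(cloudpickle.__version__),
                'NlpieLangDetection==0.0.1',
            ],
        },
    ],
    'name': 'mlflow-env'
}&lt;/CODE&gt;&lt;/PRE&gt;&lt;PRE&gt;&lt;CODE&gt;# Save the model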
mlflow.set_experiment('/Users/Youssef.Meguebli@sanofi.com/Language_Detection_Translation/LangDetectionTest')
with mlflow.start_run(run_name="Nlpie Language Detection") as run:
    model_path = f"{model_path}-{run.info.run_uuid}"
    mlflow.log_param("algorithm", "SparNLPLangDetection")
    mlflow.pyfunc.save_model(path=model_path, python_model=sparknlp_model, conda_env=conda_env)&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;I get an error in the last piece of code, where I try to save the model to the MLflow registry.&lt;/P&gt;&lt;P&gt;This is the error I am getting:&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;TypeError: cannot pickle '_thread.RLock' object
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
&amp;lt;command-2121909764500367&amp;gt; in &amp;lt;module&amp;gt;
      4     model_path = f"{model_path}-{run.info.run_uuid}"
      5     mlflow.log_param("algorithm", "SparNLPLangDetection")
----&amp;gt; 6     mlflow.pyfunc.save_model(path=model_path, python_model=sparknlp_model, conda_env=conda_env)
&amp;nbsp;
/databricks/python/lib/python3.8/site-packages/mlflow/pyfunc/__init__.py in save_model(path, loader_module, data_path, code_path, conda_env, mlflow_model, python_model, artifacts, signature, input_example, pip_requirements, extra_pip_requirements, **kwargs)
   1467         )
   1468     elif second_argument_set_specified:
-&amp;gt; 1469         return mlflow.pyfunc.model._save_model_with_class_artifacts_params(
   1470             path=path,
   1471             python_model=python_model,
&amp;nbsp;
/databricks/python/lib/python3.8/site-packages/mlflow/pyfunc/model.py in _save_model_with_class_artifacts_params(path, python_model, artifacts, conda_env, code_paths, mlflow_model, pip_requirements, extra_pip_requirements)
    162         saved_python_model_subpath = "python_model.pkl"
    163         with open(os.path.join(path, saved_python_model_subpath), "wb") as out:
--&amp;gt; 164             cloudpickle.dump(python_model, out)
    165         custom_model_config_kwargs[CONFIG_KEY_PYTHON_MODEL] = saved_python_model_subpath
    166     else:
&amp;nbsp;
/databricks/python/lib/python3.8/site-packages/cloudpickle/cloudpickle_fast.py in dump(obj, file, protocol, buffer_callback)
     53         compatibility with older versions of Python.
     54         """
---&amp;gt; 55         CloudPickler(
     56             file, protocol=protocol, buffer_callback=buffer_callback
     57         ).dump(obj)
&amp;nbsp;
/databricks/python/lib/python3.8/site-packages/cloudpickle/cloudpickle_fast.py in dump(self, obj)
    631     def dump(self, obj):
    632         try:
--&amp;gt; 633             return Pickler.dump(self, obj)
    634         except RuntimeError as e:
    635             if "recursion" in e.args[0]:
&amp;nbsp;
TypeError: cannot pickle '_thread.RLock' object&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;Please let me know if you need any further details.&lt;/P&gt;&lt;P&gt;Many thanks in advance for your support.&lt;/P&gt;</description>
      <pubDate>Mon, 13 Jun 2022 10:46:26 GMT</pubDate>
      <guid>https://community.databricks.com/t5/machine-learning/pushing-sparknlp-model-on-mlflow/m-p/17816#M974</guid>
      <dc:creator>Youssef1985</dc:creator>
      <dc:date>2022-06-13T10:46:26Z</dc:date>
    </item>
    <item>
      <title>Re: Pushing SparkNLP Model on Mlflow</title>
      <link>https://community.databricks.com/t5/machine-learning/pushing-sparknlp-model-on-mlflow/m-p/17817#M975</link>
      <description>&lt;P&gt;Hi.&lt;/P&gt;&lt;P&gt;The problem might be with pickling the language model: mlflow.pyfunc.save_model serializes the PythonModel instance with cloudpickle, and the traceback shows that step failing.&lt;/P&gt;&lt;P&gt;Have you tried using&amp;nbsp;&lt;A href="https://mlflow.org/docs/latest/python_api/mlflow.spark.html#mlflow.spark.log_model" alt="https://mlflow.org/docs/latest/python_api/mlflow.spark.html#mlflow.spark.log_model" target="_blank"&gt;mlflow.spark.log_model&lt;/A&gt;&amp;nbsp;to save the model? Spark ML models cannot be serialized as pickle files. Instead, they are serialized in a language-neutral hierarchical format that both Python and Scala can read, as in the "sparkml" directory below.&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;spark-model
+-sparkml/
| +-stages/
| | +-1_DecisionTreeRegressor_6aae1e6c3fed/
| | | +-data/
| | | | +-part-00000-a4b9cb99-abd2-40c3-90d2-a46b44926263-c000.snappy.parquet
| | | | +-.part-00000-a4b9cb99-abd2-40c3-90d2-a46b44926263-c000.snappy.parquet.crc
| | | | +-._SUCCESS.crc
| | | |
| | | +-metadata/
| | |   +-part-00000
| | |   +-.part-00000.crc
| | |   +-._SUCCESS.crc
| | |  
| | +-0_VectorAssembler_ce8bcea8c5b3/
| |   +-metadata/
| |     +-part-00000
| |     +-.part-00000.crc
| |     +-._SUCCESS.crc
| |    
| +-metadata/
|   +-part-00000
|   +-.part-00000.crc
|   +-._SUCCESS.crc
|  
+-requirements.txt
+-python_env.yaml
+-conda.yaml
+-MLmodel&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;Another resource on model deployment is this Medium post: "&lt;A href="https://santiagof.medium.com/effortless-models-deployment-with-mlflow-packing-a-nlp-product-review-classifier-from-huggingface-13be2650333" alt="https://santiagof.medium.com/effortless-models-deployment-with-mlflow-packing-a-nlp-product-review-classifier-from-huggingface-13be2650333" target="_blank"&gt;Effortless models deployment with Mlflow — Packing a NLP product review classifier from HuggingFace&lt;/A&gt;".&lt;/P&gt;</description>
      <pubDate>Fri, 03 Mar 2023 17:17:15 GMT</pubDate>
      <guid>https://community.databricks.com/t5/machine-learning/pushing-sparknlp-model-on-mlflow/m-p/17817#M975</guid>
      <dc:creator>Kari</dc:creator>
      <dc:date>2023-03-03T17:17:15Z</dc:date>
    </item>
    <item>
      <title>Re: Pushing SparkNLP Model on Mlflow</title>
      <link>https://community.databricks.com/t5/machine-learning/pushing-sparknlp-model-on-mlflow/m-p/17818#M976</link>
      <description>&lt;P&gt;Website design tutorial: &lt;A href="https://arzgu.ir/blog/What%20is%20website%20design" target="test_blank"&gt;https://arzgu.ir/blog/What%20is%20website%20design&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Sat, 04 Mar 2023 07:31:55 GMT</pubDate>
      <guid>https://community.databricks.com/t5/machine-learning/pushing-sparknlp-model-on-mlflow/m-p/17818#M976</guid>
      <dc:creator>tala</dc:creator>
      <dc:date>2023-03-04T07:31:55Z</dc:date>
    </item>
  </channel>
</rss>

