<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Serializing custom SparkMLlib Evaluator in Machine Learning</title>
    <link>https://community.databricks.com/t5/machine-learning/serializing-custom-sparkmllib-evaluator/m-p/74835#M3369</link>
    <description>&lt;P&gt;Hi guys,&lt;/P&gt;&lt;P&gt;We're facing a weird behavior or we're missing some configuration in our code. I've tried to find some information unsuccessfully. Let me try to explain our case, we have implemented a custom Evaluator in python using PySpark API, something like the following code snippet:&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="python"&gt;from pyspark.ml.evaluation import BinaryClassificationEvaluator,Evaluator
from pyspark.ml.util import DefaultParamsReadable,DefaultParamsWritable
import time
 
class BinaryEvaluator (BinaryClassificationEvaluator) :
    pass
 
class DummyEvaluator (Evaluator,DefaultParamsWritable,DefaultParamsReadable) :
    def __init__(self):
        self.uid=1
        self._paramMap = {}
        self._defaultParamMap = {}
        pass
 
    def _evaluate(self,dataset):
        return 0.9
   
    def isLargetBetter(self):
        return True&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;The problem arises at the moment when the DummyEvaluator is serialized and stored in the StorageAccount. To do that, the following code is used:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="python"&gt;DummyEvaluator().save("abfss://mycontainer@mystacc.dfs.coore.windows.net/mymodel")&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;And the following error occurs:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;Failure to initialize configuration for storage account mystacc.dfs.core.windows.net: Invalid configuration value detected for fs.azure.account.keyInvalid configuration value detected for fs.azure.account.key
---------------------------------------------------------------------------
Py4JJavaError                             Traceback (most recent call last)
File &amp;lt;command-2244732439603884&amp;gt;, line 1
----&amp;gt; 1 DummyEvaluator().save(f'{base_path}/dummy_{int(time.time())}')

File /databricks/spark/python/pyspark/ml/util.py:355, in MLWritable.save(self, path)
    353 def save(self, path: str) -&amp;gt; None:
    354     """Save this ML instance to the given path, a shortcut of 'write().save(path)'."""
--&amp;gt; 355     self.write().save(path)

File /databricks/spark/python/pyspark/ml/util.py:249, in MLWriter.save(self, path)
    247 if self.shouldOverwrite:
    248     self._handleOverwrite(path)
--&amp;gt; 249 self.saveImpl(path)

File /databricks/spark/python/pyspark/ml/util.py:519, in DefaultParamsWriter.saveImpl(self, path)
    518 def saveImpl(self, path: str) -&amp;gt; None:
--&amp;gt; 519     DefaultParamsWriter.saveMetadata(self.instance, path, self.sc)

File /databricks/spark/python/pyspark/ml/util.py:559, in DefaultParamsWriter.saveMetadata(instance, path, sc, extraMetadata, paramMap)
    555 metadataPath = os.path.join(path, "metadata")
    556 metadataJson = DefaultParamsWriter._get_metadata_to_save(
    557     instance, sc, extraMetadata, paramMap
    558 )
--&amp;gt; 559 sc.parallelize([metadataJson], 1).saveAsTextFile(metadataPath)

File /databricks/spark/python/pyspark/instrumentation_utils.py:48, in _wrap_function.&amp;lt;locals&amp;gt;.wrapper(*args, **kwargs)
     46 start = time.perf_counter()
     47 try:
---&amp;gt; 48     res = func(*args, **kwargs)
     49     logger.log_success(
     50         module_name, class_name, function_name, time.perf_counter() - start, signature
     51     )
     52     return res

File /databricks/spark/python/pyspark/rdd.py:3457, in RDD.saveAsTextFile(self, path, compressionCodecClass)
   3455     self.ctx._jvm.PythonRDD.saveAsTextFileImpl(keyed._jrdd, path, compressionCodecClass)
   3456 else:
-&amp;gt; 3457     self.ctx._jvm.PythonRDD.saveAsTextFileImpl(keyed._jrdd, path)

File /databricks/spark/python/lib/py4j-0.10.9.7-src.zip/py4j/java_gateway.py:1355, in JavaMember.__call__(self, *args)
   1349 command = proto.CALL_COMMAND_NAME +\
   1350     self.command_header +\
   1351     args_command +\
   1352     proto.END_COMMAND_PART
   1354 answer = self.gateway_client.send_command(command)
-&amp;gt; 1355 return_value = get_return_value(
   1356     answer, self.gateway_client, self.target_id, self.name)
   1358 for temp_arg in temp_args:
   1359     if hasattr(temp_arg, "_detach"):

File /databricks/spark/python/pyspark/errors/exceptions/captured.py:188, in capture_sql_exception.&amp;lt;locals&amp;gt;.deco(*a, **kw)
    186 def deco(*a: Any, **kw: Any) -&amp;gt; Any:
    187     try:
--&amp;gt; 188         return f(*a, **kw)
    189     except Py4JJavaError as e:
    190         converted = convert_exception(e.java_exception)

File /databricks/spark/python/lib/py4j-0.10.9.7-src.zip/py4j/protocol.py:326, in get_return_value(answer, gateway_client, target_id, name)
    324 value = OUTPUT_CONVERTER[type](answer[2:], gateway_client)
    325 if answer[1] == REFERENCE_TYPE:
--&amp;gt; 326     raise Py4JJavaError(
    327         "An error occurred while calling {0}{1}{2}.\n".
    328         format(target_id, ".", name), value)
    329 else:
    330     raise Py4JError(
    331         "An error occurred while calling {0}{1}{2}. Trace:\n{3}\n".
    332         format(target_id, ".", name, value))

Py4JJavaError: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.saveAsTextFileImpl.
: Failure to initialize configuration for storage account mystacc.dfs.core.windows.net: Invalid configuration value detected for fs.azure.account.keyInvalid configuration value detected for fs.azure.account.key
	at shaded.databricks.azurebfs.org.apache.hadoop.fs.azurebfs.services.SimpleKeyProvider.getStorageAccountKey(SimpleKeyProvider.java:52)
	at shaded.databricks.azurebfs.org.apache.hadoop.fs.azurebfs.AbfsConfiguration.getStorageAccountKey(AbfsConfiguration.java:682)
	at shaded.databricks.azurebfs.org.apache.hadoop.fs.azurebfs.AzureBlobFileSystemStore.initializeClient(AzureBlobFileSystemStore.java:2061)
	at shaded.databricks.azurebfs.org.apache.hadoop.fs.azurebfs.AzureBlobFileSystemStore.&amp;lt;init&amp;gt;(AzureBlobFileSystemStore.java:266)
	at shaded.databricks.azurebfs.org.apache.hadoop.fs.azurebfs.AzureBlobFileSystem.initialize(AzureBlobFileSystem.java:230)
	at com.databricks.common.filesystem.LokiABFS.initialize(LokiABFS.scala:36)
	at com.databricks.common.filesystem.LokiFileSystem$.$anonfun$getLokiFS$1(LokiFileSystem.scala:148)
	at com.databricks.common.filesystem.Cache.getOrCompute(Cache.scala:38)
	at com.databricks.common.filesystem.LokiFileSystem$.getLokiFS(LokiFileSystem.scala:145)
	at com.databricks.common.filesystem.LokiFileSystem.$anonfun$initialize$1(LokiFileSystem.scala:191)
	at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
	at scala.util.DynamicVariable.withValue(DynamicVariable.scala:62)
	at com.databricks.common.filesystem.LokiFileSystem.initialize(LokiFileSystem.scala:183)
	at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3469)
	at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:537)
	at org.apache.hadoop.fs.Path.getFileSystem(Path.java:365)
	at com.databricks.unity.SAM.createDelegate(SAM.scala:183)
	at com.databricks.unity.SAM.createDelegate$(SAM.scala:175)
	at com.databricks.unity.ClusterDefaultSAM$.createDelegate(SAM.scala:193)
	at com.databricks.sql.acl.fs.CredentialScopeFileSystem.createDelegate(CredentialScopeFileSystem.scala:83)
	at com.databricks.sql.acl.fs.CredentialScopeFileSystem.$anonfun$setDelegates$1(CredentialScopeFileSystem.scala:136)
	at com.databricks.sql.acl.fs.Lazy.apply(DelegatingFileSystem.scala:310)
	at com.databricks.sql.acl.fs.CredentialScopeFileSystem.getWorkingDirectory(CredentialScopeFileSystem.scala:259)
	at org.apache.spark.internal.io.SparkHadoopWriterUtils$.createPathFromString(SparkHadoopWriterUtils.scala:90)
	at org.apache.spark.rdd.PairRDDFunctions.$anonfun$saveAsHadoopFile$4(PairRDDFunctions.scala:1061)
	at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:165)
	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:125)
	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
	at org.apache.spark.rdd.RDD.withScope(RDD.scala:451)
	at org.apache.spark.rdd.PairRDDFunctions.saveAsHadoopFile(PairRDDFunctions.scala:1027)
	at org.apache.spark.rdd.PairRDDFunctions.$anonfun$saveAsHadoopFile$3(PairRDDFunctions.scala:1009)
	at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:165)
	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:125)
	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
	at org.apache.spark.rdd.RDD.withScope(RDD.scala:451)
	at org.apache.spark.rdd.PairRDDFunctions.saveAsHadoopFile(PairRDDFunctions.scala:1008)
	at org.apache.spark.rdd.PairRDDFunctions.$anonfun$saveAsHadoopFile$2(PairRDDFunctions.scala:965)
	at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:165)
	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:125)
	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
	at org.apache.spark.rdd.RDD.withScope(RDD.scala:451)
	at org.apache.spark.rdd.PairRDDFunctions.saveAsHadoopFile(PairRDDFunctions.scala:963)
	at org.apache.spark.rdd.RDD.$anonfun$saveAsTextFile$2(RDD.scala:1680)
	at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:165)
	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:125)
	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
	at org.apache.spark.rdd.RDD.withScope(RDD.scala:451)
	at org.apache.spark.rdd.RDD.saveAsTextFile(RDD.scala:1680)
	at org.apache.spark.rdd.RDD.$anonfun$saveAsTextFile$1(RDD.scala:1666)
	at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:165)
	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:125)
	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
	at org.apache.spark.rdd.RDD.withScope(RDD.scala:451)
	at org.apache.spark.rdd.RDD.saveAsTextFile(RDD.scala:1666)
	at org.apache.spark.api.java.JavaRDDLike.saveAsTextFile(JavaRDDLike.scala:573)
	at org.apache.spark.api.java.JavaRDDLike.saveAsTextFile$(JavaRDDLike.scala:572)
	at org.apache.spark.api.java.AbstractJavaRDDLike.saveAsTextFile(JavaRDDLike.scala:45)
	at org.apache.spark.api.python.PythonRDD$._saveAsTextFile(PythonRDD.scala:931)
	at org.apache.spark.api.python.PythonRDD$.saveAsTextFileImpl(PythonRDD.scala:899)
	at org.apache.spark.api.python.PythonRDD.saveAsTextFileImpl(PythonRDD.scala)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
	at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:397)
	at py4j.Gateway.invoke(Gateway.java:306)
	at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
	at py4j.commands.CallCommand.execute(CallCommand.java:79)
	at py4j.ClientServerConnection.waitForCommands(ClientServerConnection.java:199)
	at py4j.ClientServerConnection.run(ClientServerConnection.java:119)
	at java.lang.Thread.run(Thread.java:750)
Caused by: Invalid configuration value detected for fs.azure.account.key
	at shaded.databricks.azurebfs.org.apache.hadoop.fs.azurebfs.diagnostics.ConfigurationBasicValidator.validate(ConfigurationBasicValidator.java:49)
	at shaded.databricks.azurebfs.org.apache.hadoop.fs.azurebfs.diagnostics.Base64StringConfigurationBasicValidator.validate(Base64StringConfigurationBasicValidator.java:40)
	at shaded.databricks.azurebfs.org.apache.hadoop.fs.azurebfs.services.SimpleKeyProvider.validateStorageAccountKey(SimpleKeyProvider.java:71)
	at shaded.databricks.azurebfs.org.apache.hadoop.fs.azurebfs.services.SimpleKeyProvider.getStorageAccountKey(SimpleKeyProvider.java:49)
	... 76 more&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;But the BinaryEvaluator is serialized without any problems:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="python"&gt;BinaryEvaluator().save("abfss://mycontainer@mystacc.dfs.coore.windows.net/mymodel")&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;The cluster has the configuration "Enable credential passthrough for user-level data access" set to True. The Storage Account container authentication method is Microsoft Entra User account.&lt;/P&gt;&lt;P&gt;How should the DummyEvaluator be implemented in order to be serialized like the BinaryEvaluator? Are we missing some trait that must be implemented too?&lt;/P&gt;&lt;P&gt;Thanks in advance for your help.&lt;/P&gt;&lt;P&gt;Regads,&lt;/P&gt;</description>
    <pubDate>Tue, 18 Jun 2024 10:52:09 GMT</pubDate>
    <dc:creator>CharlesFlores</dc:creator>
    <dc:date>2024-06-18T10:52:09Z</dc:date>
    <item>
      <title>Serializing custom SparkMLlib Evaluator</title>
      <link>https://community.databricks.com/t5/machine-learning/serializing-custom-sparkmllib-evaluator/m-p/74835#M3369</link>
      <description>&lt;P&gt;Hi guys,&lt;/P&gt;&lt;P&gt;We're facing a weird behavior or we're missing some configuration in our code. I've tried to find some information unsuccessfully. Let me try to explain our case, we have implemented a custom Evaluator in python using PySpark API, something like the following code snippet:&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="python"&gt;from pyspark.ml.evaluation import BinaryClassificationEvaluator,Evaluator
from pyspark.ml.util import DefaultParamsReadable,DefaultParamsWritable
import time
 
class BinaryEvaluator (BinaryClassificationEvaluator) :
    pass
 
class DummyEvaluator (Evaluator,DefaultParamsWritable,DefaultParamsReadable) :
    def __init__(self):
        self.uid=1
        self._paramMap = {}
        self._defaultParamMap = {}
        pass
 
    def _evaluate(self,dataset):
        return 0.9
   
    def isLargetBetter(self):
        return True&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;The problem arises at the moment when the DummyEvaluator is serialized and stored in the StorageAccount. To do that, the following code is used:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="python"&gt;DummyEvaluator().save("abfss://mycontainer@mystacc.dfs.coore.windows.net/mymodel")&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;And the following error occurs:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;Failure to initialize configuration for storage account mystacc.dfs.core.windows.net: Invalid configuration value detected for fs.azure.account.keyInvalid configuration value detected for fs.azure.account.key
---------------------------------------------------------------------------
Py4JJavaError                             Traceback (most recent call last)
File &amp;lt;command-2244732439603884&amp;gt;, line 1
----&amp;gt; 1 DummyEvaluator().save(f'{base_path}/dummy_{int(time.time())}')

File /databricks/spark/python/pyspark/ml/util.py:355, in MLWritable.save(self, path)
    353 def save(self, path: str) -&amp;gt; None:
    354     """Save this ML instance to the given path, a shortcut of 'write().save(path)'."""
--&amp;gt; 355     self.write().save(path)

File /databricks/spark/python/pyspark/ml/util.py:249, in MLWriter.save(self, path)
    247 if self.shouldOverwrite:
    248     self._handleOverwrite(path)
--&amp;gt; 249 self.saveImpl(path)

File /databricks/spark/python/pyspark/ml/util.py:519, in DefaultParamsWriter.saveImpl(self, path)
    518 def saveImpl(self, path: str) -&amp;gt; None:
--&amp;gt; 519     DefaultParamsWriter.saveMetadata(self.instance, path, self.sc)

File /databricks/spark/python/pyspark/ml/util.py:559, in DefaultParamsWriter.saveMetadata(instance, path, sc, extraMetadata, paramMap)
    555 metadataPath = os.path.join(path, "metadata")
    556 metadataJson = DefaultParamsWriter._get_metadata_to_save(
    557     instance, sc, extraMetadata, paramMap
    558 )
--&amp;gt; 559 sc.parallelize([metadataJson], 1).saveAsTextFile(metadataPath)

File /databricks/spark/python/pyspark/instrumentation_utils.py:48, in _wrap_function.&amp;lt;locals&amp;gt;.wrapper(*args, **kwargs)
     46 start = time.perf_counter()
     47 try:
---&amp;gt; 48     res = func(*args, **kwargs)
     49     logger.log_success(
     50         module_name, class_name, function_name, time.perf_counter() - start, signature
     51     )
     52     return res

File /databricks/spark/python/pyspark/rdd.py:3457, in RDD.saveAsTextFile(self, path, compressionCodecClass)
   3455     self.ctx._jvm.PythonRDD.saveAsTextFileImpl(keyed._jrdd, path, compressionCodecClass)
   3456 else:
-&amp;gt; 3457     self.ctx._jvm.PythonRDD.saveAsTextFileImpl(keyed._jrdd, path)

File /databricks/spark/python/lib/py4j-0.10.9.7-src.zip/py4j/java_gateway.py:1355, in JavaMember.__call__(self, *args)
   1349 command = proto.CALL_COMMAND_NAME +\
   1350     self.command_header +\
   1351     args_command +\
   1352     proto.END_COMMAND_PART
   1354 answer = self.gateway_client.send_command(command)
-&amp;gt; 1355 return_value = get_return_value(
   1356     answer, self.gateway_client, self.target_id, self.name)
   1358 for temp_arg in temp_args:
   1359     if hasattr(temp_arg, "_detach"):

File /databricks/spark/python/pyspark/errors/exceptions/captured.py:188, in capture_sql_exception.&amp;lt;locals&amp;gt;.deco(*a, **kw)
    186 def deco(*a: Any, **kw: Any) -&amp;gt; Any:
    187     try:
--&amp;gt; 188         return f(*a, **kw)
    189     except Py4JJavaError as e:
    190         converted = convert_exception(e.java_exception)

File /databricks/spark/python/lib/py4j-0.10.9.7-src.zip/py4j/protocol.py:326, in get_return_value(answer, gateway_client, target_id, name)
    324 value = OUTPUT_CONVERTER[type](answer[2:], gateway_client)
    325 if answer[1] == REFERENCE_TYPE:
--&amp;gt; 326     raise Py4JJavaError(
    327         "An error occurred while calling {0}{1}{2}.\n".
    328         format(target_id, ".", name), value)
    329 else:
    330     raise Py4JError(
    331         "An error occurred while calling {0}{1}{2}. Trace:\n{3}\n".
    332         format(target_id, ".", name, value))

Py4JJavaError: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.saveAsTextFileImpl.
: Failure to initialize configuration for storage account mystacc.dfs.core.windows.net: Invalid configuration value detected for fs.azure.account.keyInvalid configuration value detected for fs.azure.account.key
	at shaded.databricks.azurebfs.org.apache.hadoop.fs.azurebfs.services.SimpleKeyProvider.getStorageAccountKey(SimpleKeyProvider.java:52)
	at shaded.databricks.azurebfs.org.apache.hadoop.fs.azurebfs.AbfsConfiguration.getStorageAccountKey(AbfsConfiguration.java:682)
	at shaded.databricks.azurebfs.org.apache.hadoop.fs.azurebfs.AzureBlobFileSystemStore.initializeClient(AzureBlobFileSystemStore.java:2061)
	at shaded.databricks.azurebfs.org.apache.hadoop.fs.azurebfs.AzureBlobFileSystemStore.&amp;lt;init&amp;gt;(AzureBlobFileSystemStore.java:266)
	at shaded.databricks.azurebfs.org.apache.hadoop.fs.azurebfs.AzureBlobFileSystem.initialize(AzureBlobFileSystem.java:230)
	at com.databricks.common.filesystem.LokiABFS.initialize(LokiABFS.scala:36)
	at com.databricks.common.filesystem.LokiFileSystem$.$anonfun$getLokiFS$1(LokiFileSystem.scala:148)
	at com.databricks.common.filesystem.Cache.getOrCompute(Cache.scala:38)
	at com.databricks.common.filesystem.LokiFileSystem$.getLokiFS(LokiFileSystem.scala:145)
	at com.databricks.common.filesystem.LokiFileSystem.$anonfun$initialize$1(LokiFileSystem.scala:191)
	at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
	at scala.util.DynamicVariable.withValue(DynamicVariable.scala:62)
	at com.databricks.common.filesystem.LokiFileSystem.initialize(LokiFileSystem.scala:183)
	at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3469)
	at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:537)
	at org.apache.hadoop.fs.Path.getFileSystem(Path.java:365)
	at com.databricks.unity.SAM.createDelegate(SAM.scala:183)
	at com.databricks.unity.SAM.createDelegate$(SAM.scala:175)
	at com.databricks.unity.ClusterDefaultSAM$.createDelegate(SAM.scala:193)
	at com.databricks.sql.acl.fs.CredentialScopeFileSystem.createDelegate(CredentialScopeFileSystem.scala:83)
	at com.databricks.sql.acl.fs.CredentialScopeFileSystem.$anonfun$setDelegates$1(CredentialScopeFileSystem.scala:136)
	at com.databricks.sql.acl.fs.Lazy.apply(DelegatingFileSystem.scala:310)
	at com.databricks.sql.acl.fs.CredentialScopeFileSystem.getWorkingDirectory(CredentialScopeFileSystem.scala:259)
	at org.apache.spark.internal.io.SparkHadoopWriterUtils$.createPathFromString(SparkHadoopWriterUtils.scala:90)
	at org.apache.spark.rdd.PairRDDFunctions.$anonfun$saveAsHadoopFile$4(PairRDDFunctions.scala:1061)
	at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:165)
	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:125)
	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
	at org.apache.spark.rdd.RDD.withScope(RDD.scala:451)
	at org.apache.spark.rdd.PairRDDFunctions.saveAsHadoopFile(PairRDDFunctions.scala:1027)
	at org.apache.spark.rdd.PairRDDFunctions.$anonfun$saveAsHadoopFile$3(PairRDDFunctions.scala:1009)
	at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:165)
	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:125)
	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
	at org.apache.spark.rdd.RDD.withScope(RDD.scala:451)
	at org.apache.spark.rdd.PairRDDFunctions.saveAsHadoopFile(PairRDDFunctions.scala:1008)
	at org.apache.spark.rdd.PairRDDFunctions.$anonfun$saveAsHadoopFile$2(PairRDDFunctions.scala:965)
	at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:165)
	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:125)
	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
	at org.apache.spark.rdd.RDD.withScope(RDD.scala:451)
	at org.apache.spark.rdd.PairRDDFunctions.saveAsHadoopFile(PairRDDFunctions.scala:963)
	at org.apache.spark.rdd.RDD.$anonfun$saveAsTextFile$2(RDD.scala:1680)
	at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:165)
	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:125)
	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
	at org.apache.spark.rdd.RDD.withScope(RDD.scala:451)
	at org.apache.spark.rdd.RDD.saveAsTextFile(RDD.scala:1680)
	at org.apache.spark.rdd.RDD.$anonfun$saveAsTextFile$1(RDD.scala:1666)
	at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:165)
	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:125)
	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
	at org.apache.spark.rdd.RDD.withScope(RDD.scala:451)
	at org.apache.spark.rdd.RDD.saveAsTextFile(RDD.scala:1666)
	at org.apache.spark.api.java.JavaRDDLike.saveAsTextFile(JavaRDDLike.scala:573)
	at org.apache.spark.api.java.JavaRDDLike.saveAsTextFile$(JavaRDDLike.scala:572)
	at org.apache.spark.api.java.AbstractJavaRDDLike.saveAsTextFile(JavaRDDLike.scala:45)
	at org.apache.spark.api.python.PythonRDD$._saveAsTextFile(PythonRDD.scala:931)
	at org.apache.spark.api.python.PythonRDD$.saveAsTextFileImpl(PythonRDD.scala:899)
	at org.apache.spark.api.python.PythonRDD.saveAsTextFileImpl(PythonRDD.scala)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
	at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:397)
	at py4j.Gateway.invoke(Gateway.java:306)
	at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
	at py4j.commands.CallCommand.execute(CallCommand.java:79)
	at py4j.ClientServerConnection.waitForCommands(ClientServerConnection.java:199)
	at py4j.ClientServerConnection.run(ClientServerConnection.java:119)
	at java.lang.Thread.run(Thread.java:750)
Caused by: Invalid configuration value detected for fs.azure.account.key
	at shaded.databricks.azurebfs.org.apache.hadoop.fs.azurebfs.diagnostics.ConfigurationBasicValidator.validate(ConfigurationBasicValidator.java:49)
	at shaded.databricks.azurebfs.org.apache.hadoop.fs.azurebfs.diagnostics.Base64StringConfigurationBasicValidator.validate(Base64StringConfigurationBasicValidator.java:40)
	at shaded.databricks.azurebfs.org.apache.hadoop.fs.azurebfs.services.SimpleKeyProvider.validateStorageAccountKey(SimpleKeyProvider.java:71)
	at shaded.databricks.azurebfs.org.apache.hadoop.fs.azurebfs.services.SimpleKeyProvider.getStorageAccountKey(SimpleKeyProvider.java:49)
	... 76 more&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;But the BinaryEvaluator is serialized without any problems:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="python"&gt;BinaryEvaluator().save("abfss://mycontainer@mystacc.dfs.coore.windows.net/mymodel")&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;The cluster has the configuration "Enable credential passthrough for user-level data access" set to True. The Storage Account container authentication method is Microsoft Entra User account.&lt;/P&gt;&lt;P&gt;How should the DummyEvaluator be implemented in order to be serialized like the BinaryEvaluator? Are we missing some trait that must be implemented too?&lt;/P&gt;&lt;P&gt;Thanks in advance for your help.&lt;/P&gt;&lt;P&gt;Regads,&lt;/P&gt;</description>
      <pubDate>Tue, 18 Jun 2024 10:52:09 GMT</pubDate>
      <guid>https://community.databricks.com/t5/machine-learning/serializing-custom-sparkmllib-evaluator/m-p/74835#M3369</guid>
      <dc:creator>CharlesFlores</dc:creator>
      <dc:date>2024-06-18T10:52:09Z</dc:date>
    </item>
    <item>
      <title>Re: Serializing custom SparkMLlib Evaluator</title>
      <link>https://community.databricks.com/t5/machine-learning/serializing-custom-sparkmllib-evaluator/m-p/75012#M3373</link>
      <description>&lt;P&gt;Hi, &lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/9"&gt;@Retired_mod&lt;/a&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks for your reply. I cannot understand why the serialization of the BinaryEvaluator works without this configuration set, but the DummyEvaluator does not.&lt;/P&gt;&lt;P&gt;Both instances are created under the same SparkSession (in the same notebook). If the configuration is missing, neither of the instances can be serialized.&lt;/P&gt;</description>
      <pubDate>Wed, 19 Jun 2024 14:44:13 GMT</pubDate>
      <guid>https://community.databricks.com/t5/machine-learning/serializing-custom-sparkmllib-evaluator/m-p/75012#M3373</guid>
      <dc:creator>CharlesFlores</dc:creator>
      <dc:date>2024-06-19T14:44:13Z</dc:date>
    </item>
  </channel>
</rss>

