Pystan3 with databricks runtime 13.3

sara-aliza
New Contributor

I am getting a read-only file system error when running pystan3's build() inside a UDF.
I think the issue is related to the location the code runs from, which is read-only, so I am looking to set a custom cache location inside the UDF. Based on this link, I am able to change the cache_directory. But how can I set it inside the UDF so that it writes to a location with write access, somewhere other than the worker node?

More context: I don't get this issue when running under the workspace.

If I run the Python script below under a repo I created at Workspace -> Repos -> {user} -> {cloned_repo}, it gives me the error shown below. This is also the case when running the Python code as a workflow job with the source set to "Git Provider" and the path to the notebook specified.

But if I run the Python script under a workspace folder I created at Workspace -> Workspace -> Users -> {user}, it works fine.

PythonException: 
  An exception was thrown from the Python worker. Please see the stack trace below.
'RuntimeError: Exception while building model extension module: `DistutilsFileError("could not create 'build/temp.linux-x86_64-cpython-310/root/.cache/httpstan/4.12.0/models/yk53ta4w': Read-only file system")`, traceback: `['  File "/local_disk0/.ephemeral_nfs/envs/pythonEnv-cd1c15bf-7ad4-4846-a4ff-3beaa207767c/lib/python3.10/site-packages/httpstan/views.py", line 114, in handle_create_model\n    compiler_output = await httpstan.models.build_services_extension_module(program_code)\n', '  File "/local_disk0/.ephemeral_nfs/envs/pythonEnv-cd1c15bf-7ad4-4846-a4ff-3beaa207767c/lib/python3.10/site-packages/httpstan/models.py", line 172, in build_services_extension_module\n    compiler_output = await asyncio.get_running_loop().run_in_executor(\n', '  File "/usr/lib/python3.10/asyncio/futures.py", line 285, in __await__\n    yield self  # This tells Task to wait for completion.\n', '  File "/usr/lib/python3.10/asyncio/tasks.py", line 304, in __wakeup\n    future.result()\n', '  File "/usr/lib/python3.10/asyncio/futures.py", line 201, in result\n    raise self._exception.with_traceback(self._exception_tb)\n', '  File "/usr/lib/python3.10/concurrent/futures/thread.py", line 58, in run\n    result = self.fn(*self.args, **self.kwargs)\n', '  File "/local_disk0/.ephemeral_nfs/envs/pythonEnv-cd1c15bf-7ad4-4846-a4ff-3beaa207767c/lib/python3.10/site-packages/httpstan/build_ext.py", line 86, in run_build_ext\n    build_extension.run()\n', '  File "/databricks/python/lib/python3.10/site-packages/setuptools/command/build_ext.py", line 79, in run\n    _build_ext.run(self)\n', '  File "/databricks/python/lib/python3.10/site-packages/Cython/Distutils/old_build_ext.py", line 186, in run\n    _build_ext.build_ext.run(self)\n', '  File "/databricks/python/lib/python3.10/site-packages/setuptools/_distutils/command/build_ext.py", line 346, in run\n    self.build_extensions()\n', '  File "/databricks/python/lib/python3.10/site-packages/Cython/Distutils/old_build_ext.py", line 195, in 
build_extensions\n    _build_ext.build_ext.build_extensions(self)\n', '  File "/databricks/python/lib/python3.10/site-packages/setuptools/_distutils/command/build_ext.py", line 466, in build_extensions\n    self._build_extensions_serial()\n', '  File "/databricks/python/lib/python3.10/site-packages/setuptools/_distutils/command/build_ext.py", line 492, in _build_extensions_serial\n    self.build_extension(ext)\n', '  File "/databricks/python/lib/python3.10/site-packages/setuptools/command/build_ext.py", line 202, in build_extension\n    _build_ext.build_extension(self, ext)\n', '  File "/databricks/python/lib/python3.10/site-packages/setuptools/_distutils/command/build_ext.py", line 547, in build_extension\n    objects = self.compiler.compile(\n', '  File "/databricks/python/lib/python3.10/site-packages/setuptools/_distutils/ccompiler.py", line 589, in compile\n    macros, objects, extra_postargs, pp_opts, build = self._setup_compile(\n', '  File "/databricks/python/lib/python3.10/site-packages/setuptools/_distutils/ccompiler.py", line 360, in _setup_compile\n    self.mkpath(os.path.dirname(obj))\n', '  File "/databricks/python/lib/python3.10/site-packages/setuptools/_distutils/ccompiler.py", line 993, in mkpath\n    mkpath(name, mode, dry_run=self.dry_run)\n', '  File "/databricks/python/lib/python3.10/site-packages/setuptools/_distutils/dir_util.py", line 78, in mkpath\n    raise DistutilsFileError(\n']`', from /root/.ipykernel/3060/command-3851521498225777-2537282871, line 2066. Full traceback below:
Traceback (most recent call last):
  File "/root/.ipykernel/3060/command-3851521498225777-2537282871", line 2066, in elasticity_udf
  File "/local_disk0/.ephemeral_nfs/envs/pythonEnv-cd1c15bf-7ad4-4846-a4ff-3beaa207767c/lib/python3.10/site-packages/stan/model.py", line 520, in build
    except KeyboardInterrupt:
  File "/databricks/python/lib/python3.10/site-packages/nest_asyncio.py", line 39, in run
    with suppress(asyncio.CancelledError):
  File "/databricks/python/lib/python3.10/site-packages/nest_asyncio.py", line 89, in run_until_complete
    return f.result()
  File "/usr/lib/python3.10/asyncio/futures.py", line 201, in result
    raise self._exception.with_traceback(self._exception_tb)
  File "/usr/lib/python3.10/asyncio/tasks.py", line 300, in __step
    self = None  # Needed to break cycles when an exception occurs.
  File "/local_disk0/.ephemeral_nfs/envs/pythonEnv-cd1c15bf-7ad4-4846-a4ff-3beaa207767c/lib/python3.10/site-packages/stan/model.py", line 488, in go
    raise RuntimeError(resp.json()["message"])
RuntimeError: Exception while building model extension module: `DistutilsFileError("could not create 'build/temp.linux-x86_64-cpython-310/root/.cache/httpstan/4.12.0/models/yk53ta4w': Read-only file system")`, traceback: `['  File "/local_disk0/.ephemeral_nfs/envs/pythonEnv-cd1c15bf-7ad4-4846-a4ff-3beaa207767c/lib/python3.10/site-packages/httpstan/views.py", line 114, in handle_create_model\n    compiler_output = await httpstan.models.build_services_extension_module(program_code)\n', '  File "/local_disk0/.ephemeral_nfs/envs/pythonEnv-cd1c15bf-7ad4-4846-a4ff-3beaa207767c/lib/python3.10/site-packages/httpstan/models.py", line 172, in build_services_extension_module\n    compiler_output = await asyncio.get_running_loop().run_in_executor(\n', '  File "/usr/lib/python3.10/asyncio/futures.py", line 285, in __await__\n    yield self  # This tells Task to wait for completion.\n', '  File "/usr/lib/python3.10/asyncio/tasks.py", line 304, in __wakeup\n    future.result()\n', '  File "/usr/lib/python3.10/asyncio/futures.py", line 201, in result\n    raise self._exception.with_traceback(self._exception_tb)\n', '  File "/usr/lib/python3.10/concurrent/futures/thread.py", line 58, in run\n    result = self.fn(*self.args, **self.kwargs)\n', '  File "/local_disk0/.ephemeral_nfs/envs/pythonEnv-cd1c15bf-7ad4-4846-a4ff-3beaa207767c/lib/python3.10/site-packages/httpstan/build_ext.py", line 86, in run_build_ext\n    build_extension.run()\n', '  File "/databricks/python/lib/python3.10/site-packages/setuptools/command/build_ext.py", line 79, in run\n    _build_ext.run(self)\n', '  File "/databricks/python/lib/python3.10/site-packages/Cython/Distutils/old_build_ext.py", line 186, in run\n    _build_ext.build_ext.run(self)\n', '  File "/databricks/python/lib/python3.10/site-packages/setuptools/_distutils/command/build_ext.py", line 346, in run\n    self.build_extensions()\n', '  File "/databricks/python/lib/python3.10/site-packages/Cython/Distutils/old_build_ext.py", line 195, in 
build_extensions\n    _build_ext.build_ext.build_extensions(self)\n', '  File "/databricks/python/lib/python3.10/site-packages/setuptools/_distutils/command/build_ext.py", line 466, in build_extensions\n    self._build_extensions_serial()\n', '  File "/databricks/python/lib/python3.10/site-packages/setuptools/_distutils/command/build_ext.py", line 492, in _build_extensions_serial\n    self.build_extension(ext)\n', '  File "/databricks/python/lib/python3.10/site-packages/setuptools/command/build_ext.py", line 202, in build_extension\n    _build_ext.build_extension(self, ext)\n', '  File "/databricks/python/lib/python3.10/site-packages/setuptools/_distutils/command/build_ext.py", line 547, in build_extension\n    objects = self.compiler.compile(\n', '  File "/databricks/python/lib/python3.10/site-packages/setuptools/_distutils/ccompiler.py", line 589, in compile\n    macros, objects, extra_postargs, pp_opts, build = self._setup_compile(\n', '  File "/databricks/python/lib/python3.10/site-packages/setuptools/_distutils/ccompiler.py", line 360, in _setup_compile\n    self.mkpath(os.path.dirname(obj))\n', '  File "/databricks/python/lib/python3.10/site-packages/setuptools/_distutils/ccompiler.py", line 993, in mkpath\n    mkpath(name, mode, dry_run=self.dry_run)\n', '  File "/databricks/python/lib/python3.10/site-packages/setuptools/_distutils/dir_util.py", line 78, in mkpath\n    raise DistutilsFileError(\n']`
---------------------------------------------------------------------------
PythonException                           Traceback (most recent call last)
File <command-2300127166286607>, line 8
      3 df_out = df_input_small.groupby("Exec_Id").applyInPandas(
      4     elasticity_udf,
      5     schema="Store_Physical_Location_No string, Exec_Id string, Item_No string",
      6 )
      7 # DOES NOT WORK
----> 8 df_out.count()

File /databricks/spark/python/pyspark/instrumentation_utils.py:48, in _wrap_function.<locals>.wrapper(*args, **kwargs)
     46 start = time.perf_counter()
     47 try:
---> 48     res = func(*args, **kwargs)
     49     logger.log_success(
     50         module_name, class_name, function_name, time.perf_counter() - start, signature
     51     )
     52     return res

File /databricks/spark/python/pyspark/sql/dataframe.py:1206, in DataFrame.count(self)
   1183 def count(self) -> int:
   1184     """Returns the number of rows in this :class:`DataFrame`.
   1185 
   1186     .. versionadded:: 1.3.0
   (...)
   1204     3
   1205     """
-> 1206     return int(self._jdf.count())

File /databricks/spark/python/lib/py4j-0.10.9.7-src.zip/py4j/java_gateway.py:1355, in JavaMember.__call__(self, *args)
   1349 command = proto.CALL_COMMAND_NAME +\
   1350     self.command_header +\
   1351     args_command +\
   1352     proto.END_COMMAND_PART
   1354 answer = self.gateway_client.send_command(command)
-> 1355 return_value = get_return_value(
   1356     answer, self.gateway_client, self.target_id, self.name)
   1358 for temp_arg in temp_args:
   1359     if hasattr(temp_arg, "_detach"):

File /databricks/spark/python/pyspark/errors/exceptions/captured.py:194, in capture_sql_exception.<locals>.deco(*a, **kw)
    190 converted = convert_exception(e.java_exception)
    191 if not isinstance(converted, UnknownException):
    192     # Hide where the exception came from that shows a non-Pythonic
    193     # JVM exception message.
--> 194     raise converted from None
    195 else:
    196     raise

1 REPLY

Kaniz_Fatma
Community Manager

Hi @sara-aliza, this is a good start! To set the cache location inside the UDF, you can modify your UDF code to set the environment variable before invoking pystan3.
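As a rough sketch of what setting this inside the UDF could look like: the failing path in the traceback, `build/temp.linux-x86_64-cpython-310/root/.cache/httpstan/...`, is a relative path, which suggests the compiler's temporary build directory is created under the current working directory, while `/root/.cache/httpstan` is the default cache location. The helper below therefore changes both. The names `writable_build_env`, `base_dir`, and `build_model_in_udf` are hypothetical and not part of pystan or Databricks. The sketch assumes that httpstan resolves its cache directory through the `XDG_CACHE_HOME` environment variable at build time and that the node-local temp directory is writable; neither assumption has been verified on runtime 13.3.

```python
import contextlib
import os
import tempfile


@contextlib.contextmanager
def writable_build_env(base_dir=None):
    """Temporarily move the cache dir and working dir to a writable location.

    base_dir is a hypothetical parameter; None means the default temp dir
    of the node the code is running on.
    """
    work_dir = tempfile.mkdtemp(prefix="stan_build_", dir=base_dir)
    old_cwd = os.getcwd()
    old_cache = os.environ.get("XDG_CACHE_HOME")
    # Assumption: httpstan picks its cache directory from XDG_CACHE_HOME.
    os.environ["XDG_CACHE_HOME"] = os.path.join(work_dir, "cache")
    # The relative build/temp.* path in the error is created under the cwd.
    os.chdir(work_dir)
    try:
        yield work_dir
    finally:
        # Restore the previous working directory and environment.
        os.chdir(old_cwd)
        if old_cache is None:
            os.environ.pop("XDG_CACHE_HOME", None)
        else:
            os.environ["XDG_CACHE_HOME"] = old_cache


def build_model_in_udf(program_code, data):
    """Call stan.build() from inside the UDF with a writable environment."""
    import stan  # imported lazily so the import happens on the worker

    with writable_build_env():
        return stan.build(program_code, data=data)
```

Inside the function passed to `applyInPandas` (`elasticity_udf` in the traceback), the existing `stan.build(...)` call would be replaced by `build_model_in_udf(...)`. Because the directory is local to each worker, every worker compiles the model separately.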

It’s possible that the workspace context provides different permissions or configurations. To investigate further, consider the following:

  • Check the permissions of the directories where the UDF runs. Ensure that the worker nodes have write access to the cache location.
  • Verify if there are any differences in environment variables or configurations between the workspace context and other contexts.
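The two checks above can be sketched as a small probe that reports the working directory, the cache-related environment variable, and whether a few candidate directories accept writes. The function names are hypothetical, and the candidate list is only a guess based on the paths that appear in the error message.

```python
import os
import tempfile


def probe_writable(paths):
    """Return {path: bool} by trying to create a temporary file in each path."""
    results = {}
    for path in paths:
        try:
            # TemporaryFile raises OSError if the directory is missing
            # or sits on a read-only file system.
            with tempfile.TemporaryFile(dir=path):
                results[path] = True
        except OSError:
            results[path] = False
    return results


def describe_environment():
    """Collect the settings that are likely to matter for the build step."""
    candidates = [
        os.getcwd(),                     # relative build/temp.* paths land here
        os.path.expanduser("~/.cache"),  # default parent of the httpstan cache
        tempfile.gettempdir(),           # node-local temp directory
    ]
    return {
        "cwd": os.getcwd(),
        "XDG_CACHE_HOME": os.environ.get("XDG_CACHE_HOME"),
        "writable": probe_writable(candidates),
    }
```

Calling `describe_environment()` once from a notebook cell and once from inside the UDF, for both the Repos and the workspace copy of the script, should show which directory differs between the working and the failing setup.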

The error message you provided shows that the build of the model extension module failed because a directory could not be created on a read-only file system. You might want to explore the following:

  • Check the file system permissions for the cache directory.
  • Look for any specific restrictions or limitations related to the Databricks runtime environment.

If you encounter any further issues or need additional assistance, feel free to ask! 😊
