PyStan 3 with Databricks Runtime 13.3
06-27-2024 11:42 AM
I am getting a "Read-only file system" error when calling PyStan 3's build() inside a UDF.
I think the issue is related to the location the code runs from, which is read-only, so I would like to set a custom cache location inside the UDF. Based on this link, I am able to change the cache_directory. But how can I set it inside the UDF so that it writes to a location that has write access, other than the worker node?
More context: I don't get this issue when running under the workspace.
If I run the Python script under a repo I created here:
Workspace -> Repos -> {user} -> {cloned_repo}
it gives me the error I have added below. The same happens when running it as a workflow job with “Git Provider” as the source and the path to the notebook specified.
But if I run the same Python script under a workspace folder I created here:
Workspace -> Workspace -> Users -> {user}
it works fine.
PythonException:
An exception was thrown from the Python worker. Please see the stack trace below.
'RuntimeError: Exception while building model extension module: `DistutilsFileError("could not create 'build/temp.linux-x86_64-cpython-310/root/.cache/httpstan/4.12.0/models/yk53ta4w': Read-only file system")`, traceback: `[' File "/local_disk0/.ephemeral_nfs/envs/pythonEnv-cd1c15bf-7ad4-4846-a4ff-3beaa207767c/lib/python3.10/site-packages/httpstan/views.py", line 114, in handle_create_model\n compiler_output = await httpstan.models.build_services_extension_module(program_code)\n', ' File "/local_disk0/.ephemeral_nfs/envs/pythonEnv-cd1c15bf-7ad4-4846-a4ff-3beaa207767c/lib/python3.10/site-packages/httpstan/models.py", line 172, in build_services_extension_module\n compiler_output = await asyncio.get_running_loop().run_in_executor(\n', ' File "/usr/lib/python3.10/asyncio/futures.py", line 285, in __await__\n yield self # This tells Task to wait for completion.\n', ' File "/usr/lib/python3.10/asyncio/tasks.py", line 304, in __wakeup\n future.result()\n', ' File "/usr/lib/python3.10/asyncio/futures.py", line 201, in result\n raise self._exception.with_traceback(self._exception_tb)\n', ' File "/usr/lib/python3.10/concurrent/futures/thread.py", line 58, in run\n result = self.fn(*self.args, **self.kwargs)\n', ' File "/local_disk0/.ephemeral_nfs/envs/pythonEnv-cd1c15bf-7ad4-4846-a4ff-3beaa207767c/lib/python3.10/site-packages/httpstan/build_ext.py", line 86, in run_build_ext\n build_extension.run()\n', ' File "/databricks/python/lib/python3.10/site-packages/setuptools/command/build_ext.py", line 79, in run\n _build_ext.run(self)\n', ' File "/databricks/python/lib/python3.10/site-packages/Cython/Distutils/old_build_ext.py", line 186, in run\n _build_ext.build_ext.run(self)\n', ' File "/databricks/python/lib/python3.10/site-packages/setuptools/_distutils/command/build_ext.py", line 346, in run\n self.build_extensions()\n', ' File "/databricks/python/lib/python3.10/site-packages/Cython/Distutils/old_build_ext.py", line 195, in build_extensions\n 
_build_ext.build_ext.build_extensions(self)\n', ' File "/databricks/python/lib/python3.10/site-packages/setuptools/_distutils/command/build_ext.py", line 466, in build_extensions\n self._build_extensions_serial()\n', ' File "/databricks/python/lib/python3.10/site-packages/setuptools/_distutils/command/build_ext.py", line 492, in _build_extensions_serial\n self.build_extension(ext)\n', ' File "/databricks/python/lib/python3.10/site-packages/setuptools/command/build_ext.py", line 202, in build_extension\n _build_ext.build_extension(self, ext)\n', ' File "/databricks/python/lib/python3.10/site-packages/setuptools/_distutils/command/build_ext.py", line 547, in build_extension\n objects = self.compiler.compile(\n', ' File "/databricks/python/lib/python3.10/site-packages/setuptools/_distutils/ccompiler.py", line 589, in compile\n macros, objects, extra_postargs, pp_opts, build = self._setup_compile(\n', ' File "/databricks/python/lib/python3.10/site-packages/setuptools/_distutils/ccompiler.py", line 360, in _setup_compile\n self.mkpath(os.path.dirname(obj))\n', ' File "/databricks/python/lib/python3.10/site-packages/setuptools/_distutils/ccompiler.py", line 993, in mkpath\n mkpath(name, mode, dry_run=self.dry_run)\n', ' File "/databricks/python/lib/python3.10/site-packages/setuptools/_distutils/dir_util.py", line 78, in mkpath\n raise DistutilsFileError(\n']`', from /root/.ipykernel/3060/command-3851521498225777-2537282871, line 2066. Full traceback below:
Traceback (most recent call last):
File "/root/.ipykernel/3060/command-3851521498225777-2537282871", line 2066, in elasticity_udf
File "/local_disk0/.ephemeral_nfs/envs/pythonEnv-cd1c15bf-7ad4-4846-a4ff-3beaa207767c/lib/python3.10/site-packages/stan/model.py", line 520, in build
except KeyboardInterrupt:
File "/databricks/python/lib/python3.10/site-packages/nest_asyncio.py", line 39, in run
with suppress(asyncio.CancelledError):
File "/databricks/python/lib/python3.10/site-packages/nest_asyncio.py", line 89, in run_until_complete
return f.result()
File "/usr/lib/python3.10/asyncio/futures.py", line 201, in result
raise self._exception.with_traceback(self._exception_tb)
File "/usr/lib/python3.10/asyncio/tasks.py", line 300, in __step
self = None # Needed to break cycles when an exception occurs.
File "/local_disk0/.ephemeral_nfs/envs/pythonEnv-cd1c15bf-7ad4-4846-a4ff-3beaa207767c/lib/python3.10/site-packages/stan/model.py", line 488, in go
raise RuntimeError(resp.json()["message"])
RuntimeError: Exception while building model extension module: `DistutilsFileError("could not create 'build/temp.linux-x86_64-cpython-310/root/.cache/httpstan/4.12.0/models/yk53ta4w': Read-only file system")`, traceback: `[' File "/local_disk0/.ephemeral_nfs/envs/pythonEnv-cd1c15bf-7ad4-4846-a4ff-3beaa207767c/lib/python3.10/site-packages/httpstan/views.py", line 114, in handle_create_model\n compiler_output = await httpstan.models.build_services_extension_module(program_code)\n', ' File "/local_disk0/.ephemeral_nfs/envs/pythonEnv-cd1c15bf-7ad4-4846-a4ff-3beaa207767c/lib/python3.10/site-packages/httpstan/models.py", line 172, in build_services_extension_module\n compiler_output = await asyncio.get_running_loop().run_in_executor(\n', ' File "/usr/lib/python3.10/asyncio/futures.py", line 285, in __await__\n yield self # This tells Task to wait for completion.\n', ' File "/usr/lib/python3.10/asyncio/tasks.py", line 304, in __wakeup\n future.result()\n', ' File "/usr/lib/python3.10/asyncio/futures.py", line 201, in result\n raise self._exception.with_traceback(self._exception_tb)\n', ' File "/usr/lib/python3.10/concurrent/futures/thread.py", line 58, in run\n result = self.fn(*self.args, **self.kwargs)\n', ' File "/local_disk0/.ephemeral_nfs/envs/pythonEnv-cd1c15bf-7ad4-4846-a4ff-3beaa207767c/lib/python3.10/site-packages/httpstan/build_ext.py", line 86, in run_build_ext\n build_extension.run()\n', ' File "/databricks/python/lib/python3.10/site-packages/setuptools/command/build_ext.py", line 79, in run\n _build_ext.run(self)\n', ' File "/databricks/python/lib/python3.10/site-packages/Cython/Distutils/old_build_ext.py", line 186, in run\n _build_ext.build_ext.run(self)\n', ' File "/databricks/python/lib/python3.10/site-packages/setuptools/_distutils/command/build_ext.py", line 346, in run\n self.build_extensions()\n', ' File "/databricks/python/lib/python3.10/site-packages/Cython/Distutils/old_build_ext.py", line 195, in build_extensions\n 
_build_ext.build_ext.build_extensions(self)\n', ' File "/databricks/python/lib/python3.10/site-packages/setuptools/_distutils/command/build_ext.py", line 466, in build_extensions\n self._build_extensions_serial()\n', ' File "/databricks/python/lib/python3.10/site-packages/setuptools/_distutils/command/build_ext.py", line 492, in _build_extensions_serial\n self.build_extension(ext)\n', ' File "/databricks/python/lib/python3.10/site-packages/setuptools/command/build_ext.py", line 202, in build_extension\n _build_ext.build_extension(self, ext)\n', ' File "/databricks/python/lib/python3.10/site-packages/setuptools/_distutils/command/build_ext.py", line 547, in build_extension\n objects = self.compiler.compile(\n', ' File "/databricks/python/lib/python3.10/site-packages/setuptools/_distutils/ccompiler.py", line 589, in compile\n macros, objects, extra_postargs, pp_opts, build = self._setup_compile(\n', ' File "/databricks/python/lib/python3.10/site-packages/setuptools/_distutils/ccompiler.py", line 360, in _setup_compile\n self.mkpath(os.path.dirname(obj))\n', ' File "/databricks/python/lib/python3.10/site-packages/setuptools/_distutils/ccompiler.py", line 993, in mkpath\n mkpath(name, mode, dry_run=self.dry_run)\n', ' File "/databricks/python/lib/python3.10/site-packages/setuptools/_distutils/dir_util.py", line 78, in mkpath\n raise DistutilsFileError(\n']`
---------------------------------------------------------------------------
PythonException Traceback (most recent call last)
File <command-2300127166286607>, line 8
3 df_out = df_input_small.groupby("Exec_Id").applyInPandas(
4 elasticity_udf,
5 schema="Store_Physical_Location_No string, Exec_Id string, Item_No string",
6 )
7 # DOES NOT WORK
----> 8 df_out.count()
File /databricks/spark/python/pyspark/instrumentation_utils.py:48, in _wrap_function.<locals>.wrapper(*args, **kwargs)
46 start = time.perf_counter()
47 try:
---> 48 res = func(*args, **kwargs)
49 logger.log_success(
50 module_name, class_name, function_name, time.perf_counter() - start, signature
51 )
52 return res
File /databricks/spark/python/pyspark/sql/dataframe.py:1206, in DataFrame.count(self)
1183 def count(self) -> int:
1184 """Returns the number of rows in this :class:`DataFrame`.
1185
1186 .. versionadded:: 1.3.0
(...)
1204 3
1205 """
-> 1206 return int(self._jdf.count())
File /databricks/spark/python/lib/py4j-0.10.9.7-src.zip/py4j/java_gateway.py:1355, in JavaMember.__call__(self, *args)
1349 command = proto.CALL_COMMAND_NAME +\
1350 self.command_header +\
1351 args_command +\
1352 proto.END_COMMAND_PART
1354 answer = self.gateway_client.send_command(command)
-> 1355 return_value = get_return_value(
1356 answer, self.gateway_client, self.target_id, self.name)
1358 for temp_arg in temp_args:
1359 if hasattr(temp_arg, "_detach"):
File /databricks/spark/python/pyspark/errors/exceptions/captured.py:194, in capture_sql_exception.<locals>.deco(*a, **kw)
190 converted = convert_exception(e.java_exception)
191 if not isinstance(converted, UnknownException):
192 # Hide where the exception came from that shows a non-Pythonic
193 # JVM exception message.
--> 194 raise converted from None
195 else:
196 raise
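One observation from the error message: the failing path, `build/temp.linux-x86_64-cpython-310/root/.cache/...`, is relative, so the compiler appears to be creating its temporary build tree under the current working directory rather than inside the httpstan cache itself. That would fit the Repos vs. Workspace difference if the working directory on the worker is read-only in the Repos case. Below is a minimal sketch under that assumption (not verified on Databricks): switch to a writable scratch directory for the duration of the build. The helper name `writable_workdir` is hypothetical, and pointing `XDG_CACHE_HOME` at the scratch directory only helps if httpstan resolves its cache location through that variable, which is also an assumption here.

```python
import contextlib
import os
import tempfile


@contextlib.contextmanager
def writable_workdir(parent=None):
    """Temporarily run inside a fresh, writable scratch directory.

    Hypothetical helper: `parent` is an optional writable location for the
    scratch directory; by default the system temp directory is used.
    """
    scratch = tempfile.mkdtemp(prefix="stan_build_", dir=parent)
    cache_dir = os.path.join(scratch, "cache")
    os.makedirs(cache_dir, exist_ok=True)

    previous_cwd = os.getcwd()
    previous_xdg = os.environ.get("XDG_CACHE_HOME")

    # Assumption: the cache location honours XDG_CACHE_HOME on Linux.
    os.environ["XDG_CACHE_HOME"] = cache_dir
    # Relative build paths such as "build/temp..." now resolve under scratch.
    # Note: chdir affects the whole Python worker process.
    os.chdir(scratch)
    try:
        yield scratch
    finally:
        # Restore the previous working directory and environment.
        os.chdir(previous_cwd)
        if previous_xdg is None:
            os.environ.pop("XDG_CACHE_HOME", None)
        else:
            os.environ["XDG_CACHE_HOME"] = previous_xdg
```

Inside the UDF this would wrap the build call, for example `with writable_workdir(): posterior = stan.build(program_code, data=data)`, where `program_code` and `data` stand in for whatever the UDF already passes. Because the scratch directory is local to each worker, compiled models would not be shared across workers.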

