I have been running into this issue for a while now, and it happens on every triggered run. I have discovered two things:
1) If my script runs on a cluster that is inactive and the cluster is started by a scheduled trigger (not manually!), the problem doesn't occur: logging works fine and the log is saved to the Azure Blob Storage.
2) The first scheduled trigger on a continuously running cluster produces a good log; every trigger after that only produces heaps of error messages for each log line, and the actual log file is not saved/updated after the first triggered run.
I use the code below to keep noisy library logs from polluting my own logging:
import logging
from logging.handlers import TimedRotatingFileHandler

logging.getLogger("py4j.java_gateway").setLevel(logging.ERROR)
logging.getLogger("py4j").setLevel(logging.ERROR)
logging.getLogger("pyspark").setLevel(logging.ERROR)

logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s [%(levelname)s]: %(message)s",
    datefmt="%Y-%m-%d %H:%M:%S%z",
    handlers=[
        logging.StreamHandler(),
        TimedRotatingFileHandler(
            filename=LOGGING_DIR / "test.log",  # LOGGING_DIR is defined elsewhere in my script
            when="H",
            interval=1,
            backupCount=5,
            encoding="utf-8",
        ),
    ],
)
logger = logging.getLogger()
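One workaround I am considering, in case the blob-storage mount simply doesn't support the flush/append calls the file handler makes, is to log to local driver disk first and copy the file to the mounted path at the end of the run. A minimal sketch of that idea (the paths and the final copy step are assumptions on my part, not my actual setup):

```python
import logging
import shutil
from logging.handlers import TimedRotatingFileHandler
from pathlib import Path

# Local driver disk: a plain filesystem that supports append/flush.
LOCAL_LOG_DIR = Path("/tmp/run_logs")
LOCAL_LOG_DIR.mkdir(parents=True, exist_ok=True)

logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s [%(levelname)s]: %(message)s",
    handlers=[
        logging.StreamHandler(),
        TimedRotatingFileHandler(
            filename=LOCAL_LOG_DIR / "test.log",
            when="H", interval=1, backupCount=5, encoding="utf-8",
        ),
    ],
    force=True,  # replace any handlers the notebook environment already installed
)
logger = logging.getLogger()
logger.info("hello from local disk")

# At the end of the run, copy the finished log over to the mounted
# blob-storage directory (BLOB_LOG_DIR is a placeholder path here).
BLOB_LOG_DIR = Path("/tmp/fake_mount")
BLOB_LOG_DIR.mkdir(parents=True, exist_ok=True)
shutil.copy2(LOCAL_LOG_DIR / "test.log", BLOB_LOG_DIR / "test.log")
```

This avoids rotating/flushing directly on the mount, at the cost of the log only appearing in blob storage once the copy runs.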
But every line that is logged results in an error message similar to the one below:
2023-03-13 19:46:23+0000 [INFO]: ---------------------------------------Running locally: False---------------------------------------
--- Logging error ---
Traceback (most recent call last):
File "/usr/lib/python3.8/logging/__init__.py", line 1089, in emit
self.flush()
File "/usr/lib/python3.8/logging/__init__.py", line 1069, in flush
self.stream.flush()
OSError: [Errno 95] Operation not supported
Call stack:
File "/databricks/python_shell/scripts/PythonShell.py", line 29, in <module>
launch_process()
File "/databricks/python_shell/scripts/PythonShellImpl.py", line 1234, in launch_process
shell.executor.run()
File "/databricks/python_shell/scripts/PythonShellImpl.py", line 268, in run
self.shell.shell.run_cell(command_id, cmd, store_history=True)
File "/databricks/python_shell/scripts/PythonShellImpl.py", line 756, in run_cell
super(IPythonShell, self).run_cell(raw_cell, store_history, silent, shell_futures)
File "/databricks/python/lib/python3.8/site-packages/IPython/core/interactiveshell.py", line 2894, in run_cell
result = self._run_cell(
File "/databricks/python/lib/python3.8/site-packages/IPython/core/interactiveshell.py", line 2940, in _run_cell
return runner(coro)
File "/databricks/python/lib/python3.8/site-packages/IPython/core/async_helpers.py", line 68, in _pseudo_sync_runner
coro.send(None)
File "/databricks/python/lib/python3.8/site-packages/IPython/core/interactiveshell.py", line 3165, in run_cell_async
has_raised = await self.run_ast_nodes(code_ast.body, cell_name,
File "/databricks/python/lib/python3.8/site-packages/IPython/core/interactiveshell.py", line 3357, in run_ast_nodes
if (await self.run_code(code, result, async_=asy)):
File "/databricks/python/lib/python3.8/site-packages/IPython/core/interactiveshell.py", line 3437, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "<command-2200642500323014>", line 107, in <module>
logger.info("{:-^100}".format(f"Running locally: {RUN_LOCAL}"))
Message: '---------------------------------------Running locally: False---------------------------------------'
Arguments: ()
In summary: the logs are polluted by these "--- Logging error ---" messages, and the log file itself isn't saved. Logging only works when an inactive cluster is started by a scheduled trigger and shut down again afterwards. On a continuously running cluster, only the first scheduled trigger logs normally and saves the file; any run after that doesn't work.
If somebody could help me fix this, that would be great. I have some large projects running, and without any logging my company will not be happy. The logging works fine when the module and scripts are run locally.
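For what it's worth, the `OSError: [Errno 95] Operation not supported` comes out of `self.stream.flush()`, so I suspect the mounted directory itself rejects `flush()` on an open file. A quick check like the one below (the path is a placeholder for the mounted blob directory) should reproduce the error directly if that is the cause:

```python
# Open a file on the suspect directory, write to it, and flush it --
# the same calls logging's StreamHandler/FileHandler make on each emit.
from pathlib import Path

target = Path("/tmp")  # placeholder: replace with the mounted blob directory
try:
    with open(target / "flush_test.log", "a", encoding="utf-8") as f:
        f.write("test line\n")
        f.flush()  # this is where Errno 95 would surface on an unsupported mount
    print("flush supported")
except OSError as e:
    print(f"flush failed: {e}")
```

On a regular local path this prints "flush supported"; on the mount I would expect it to print the same Errno 95 as in the traceback above.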