
Logging clogged up with error messages (OSError: [Errno 95] Operation not supported, --- Logging error ---)

Kaijser
New Contributor II

I have been encountering this issue for a while now, and it happens on every triggered run. I discovered two things:

1) If I run my script on a cluster that is inactive and is started by a scheduled trigger (not manually!), this doesn't happen: logging works fine and the log file is saved to Azure blob storage.

2) On a continuously running cluster, the first scheduled trigger produces a good log, but every trigger after that results in heaps of error messages, one per log line, and the log file itself is not saved or updated after the first run.

I use the code below to suppress third-party logs that would otherwise pollute my logging:

logging.getLogger("py4j.java_gateway").setLevel(logging.ERROR)
logging.getLogger("py4j").setLevel(logging.ERROR)
logging.getLogger("pyspark").setLevel(logging.ERROR)
 
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s [%(levelname)s]: %(message)s",
    datefmt="%Y-%m-%d %H:%M:%S%z",
    handlers=[
        logging.StreamHandler(),
        TimedRotatingFileHandler(
            filename= LOGGING_DIR / "test.log", 
            when="H", 
            interval=1, 
            backupCount=5, 
            encoding="utf-8"
        )
    ]
)
logger = logging.getLogger()

But every line that is logged results in an error message similar to the one below:

2023-03-13 19:46:23+0000 [INFO]: ---------------------------------------Running locally: False---------------------------------------
--- Logging error ---
Traceback (most recent call last):
  File "/usr/lib/python3.8/logging/__init__.py", line 1089, in emit
    self.flush()
  File "/usr/lib/python3.8/logging/__init__.py", line 1069, in flush
    self.stream.flush()
OSError: [Errno 95] Operation not supported
Call stack:
  File "/databricks/python_shell/scripts/PythonShell.py", line 29, in <module>
    launch_process()
  File "/databricks/python_shell/scripts/PythonShellImpl.py", line 1234, in launch_process
    shell.executor.run()
  File "/databricks/python_shell/scripts/PythonShellImpl.py", line 268, in run
    self.shell.shell.run_cell(command_id, cmd, store_history=True)
  File "/databricks/python_shell/scripts/PythonShellImpl.py", line 756, in run_cell
    super(IPythonShell, self).run_cell(raw_cell, store_history, silent, shell_futures)
  File "/databricks/python/lib/python3.8/site-packages/IPython/core/interactiveshell.py", line 2894, in run_cell
    result = self._run_cell(
  File "/databricks/python/lib/python3.8/site-packages/IPython/core/interactiveshell.py", line 2940, in _run_cell
    return runner(coro)
  File "/databricks/python/lib/python3.8/site-packages/IPython/core/async_helpers.py", line 68, in _pseudo_sync_runner
    coro.send(None)
  File "/databricks/python/lib/python3.8/site-packages/IPython/core/interactiveshell.py", line 3165, in run_cell_async
    has_raised = await self.run_ast_nodes(code_ast.body, cell_name,
  File "/databricks/python/lib/python3.8/site-packages/IPython/core/interactiveshell.py", line 3357, in run_ast_nodes
    if (await self.run_code(code, result,  async_=asy)):
  File "/databricks/python/lib/python3.8/site-packages/IPython/core/interactiveshell.py", line 3437, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<command-2200642500323014>", line 107, in <module>
    logger.info("{:-^100}".format(f"Running locally: {RUN_LOCAL}"))
Message: '---------------------------------------Running locally: False---------------------------------------'
Arguments: ()

In summary: the logs are polluted by these "--- Logging error ---" messages, and the log files themselves aren't saved. Logging only works when the cluster is inactive, gets activated by a scheduled trigger, and is deactivated again afterwards. On a continuously running cluster, only the first scheduled trigger produces normal logging and saves the file; any runs beyond that don't work.

If somebody could help me fix this, that would be great. I have some large projects running, and without any logging my company will not be happy. The logging works when the module and scripts are run locally.

3 REPLIES

Anonymous
Not applicable

@Aaron Kaijser:

The error message you are seeing suggests that there is an issue with the file handler used to write your logs. Specifically, the StreamHandler writing to the console is fine, but the TimedRotatingFileHandler is hitting an error when it tries to flush to the log file. One possible cause is that the path to the log file is not accessible from the Databricks cluster. Check that the path is correct and that the cluster has permission to write to the specified location.
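
A quick way to test this from a notebook cell on the affected cluster. This is a sketch: it assumes LOGGING_DIR is the same Path used in the handler config, and the mount path shown here is hypothetical.

from pathlib import Path

LOGGING_DIR = Path("/dbfs/mnt/logs")  # hypothetical mount; substitute your actual path

probe = LOGGING_DIR / "_write_probe.txt"
try:
    probe.write_text("probe")          # plain create/overwrite
    with probe.open("a") as f:         # append + flush is what the file handler needs
        f.write("append")
        f.flush()
    print("Path is writable and flushable")
except OSError as e:
    print(f"Write test failed: {e}")   # Errno 95 here implicates the mount, not the handler
finally:
    probe.unlink(missing_ok=True)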

Another possibility is that the log file is locked by another process that still has it open, which prevents the TimedRotatingFileHandler from writing to it. You can try switching to a simpler handler, such as FileHandler, and see if the issue persists.
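
A minimal sketch of that swap, reusing the rest of the configuration from the original post (LOGGING_DIR as defined there). FileHandler opens the file once and only appends, so it avoids the rename/rollover that TimedRotatingFileHandler performs:

import logging

# NOTE: calling basicConfig twice in one interpreter is a no-op unless
# force=True (Python 3.8+), so include it when re-running the cell.
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s [%(levelname)s]: %(message)s",
    datefmt="%Y-%m-%d %H:%M:%S%z",
    force=True,
    handlers=[
        logging.StreamHandler(),
        logging.FileHandler(LOGGING_DIR / "test.log", encoding="utf-8"),
    ],
)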

Finally, it's worth noting that writing log files directly to a blob-storage-backed path is not the recommended way to do logging in Databricks. Anything your StreamHandler writes to stdout/stderr is captured in the cluster's driver logs, which are stored centrally and can be viewed in the Databricks UI, and cluster log delivery can be configured to persist those logs automatically. If you need your own file, a common pattern is to log to the driver's local disk and copy the finished file to blob storage at the end of the run.
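
A sketch of that "log locally, then persist" pattern, assuming a notebook context where dbutils is available; both paths here are hypothetical:

import logging

local_log = "/tmp/test.log"  # driver-local disk supports the flush/append semantics

handler = logging.FileHandler(local_log, encoding="utf-8")
handler.setFormatter(logging.Formatter("%(asctime)s [%(levelname)s]: %(message)s"))
root = logging.getLogger()
root.addHandler(handler)
root.setLevel(logging.INFO)

root.info("job started")
# ... run the job ...

# Persist the finished log to the mounted storage once the run is done.
dbutils.fs.cp(f"file:{local_log}", "dbfs:/mnt/logs/test.log")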

Anonymous
Not applicable

Hi @Aaron Kaijser

Hope all is well! Just wanted to check in to see whether you were able to resolve your issue. If so, would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help.

We'd love to hear from you.

Thanks!

manasa
Contributor

Hi @Aaron Kaijser

Were you able to write your log file to ADLS? If yes, could you please explain how you did it?
