10-28-2019 07:49 AM
My organization has an S3 bucket mounted to the Databricks filesystem under /dbfs/mnt. When using Databricks runtime 5.5 and below, the following logging code works correctly:

log_file = '/dbfs/mnt/path/to/my/bucket/test.log'
logger = logging.getLogger('test-logger')
logger.setLevel(logging.INFO)
handler = logging.FileHandler(str(log_file))
handler.setLevel(logging.INFO)
logger.addHandler(handler)
logger.info('test')
After upgrading to Databricks runtime 6.1, the above code produces a logging error "OSError: [Errno 95] Operation not supported". Here's the stack trace that is printed:
Traceback (most recent call last):
File "/databricks/python/lib/python3.7/logging/__init__.py", line 1038, in emit
self.flush()
File "/databricks/python/lib/python3.7/logging/__init__.py", line 1018, in flush
self.stream.flush()
OSError: [Errno 95] Operation not supported
The strange thing is that regular Python file I/O works fine with the same file (i.e. I can open() and write() to that filepath successfully). Any idea what's going on?
11-21-2019 08:18 AM
According to the limitations listed in the docs, starting from runtime 6.0, random writes to DBFS are no longer supported.
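Given that limitation, one workaround is to log to the driver's local disk and then copy the finished file to the mount in a single sequential write, which the mount does support. A minimal sketch (the paths here are hypothetical stand-ins; on Databricks the destination would be your '/dbfs/mnt/...' path):

```python
import logging
import os
import shutil
import tempfile

# Hypothetical paths: a scratch file on the driver's local disk, and the
# final destination (on Databricks this would be a '/dbfs/mnt/...' path).
local_log = os.path.join(tempfile.gettempdir(), "test.log")
dest_log = os.path.join(tempfile.gettempdir(), "test-final.log")

logger = logging.getLogger("test-logger")
logger.setLevel(logging.INFO)
handler = logging.FileHandler(local_log)
handler.setLevel(logging.INFO)
logger.addHandler(handler)

logger.info("test")

# Close the handler so the log file is fully flushed locally, then copy it
# to the destination as one sequential write.
handler.close()
logger.removeHandler(handler)
shutil.copy(local_log, dest_log)
```

The trade-off is that log lines only appear at the destination after the copy, so long-running jobs would need to repeat the copy step periodically.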
11-21-2019 06:53 AM
> The strange thing is that regular Python file I/O works fine with the same file. (i.e. I can open() and write() to that filepath successfully.) Any idea what's going on?
If you try to do a further operation after opening the file it will throw the same error. I'm also experiencing the same issue.
09-09-2020 10:06 PM
It's probably worth trying to override the handler's emit() method: https://docs.python.org/3/library/logging.html#handlers
This works for me:
class OurFileHandler(logging.FileHandler):
    def emit(self, record):
        # copied from https://github.com/python/cpython/blob/master/Lib/logging/__init__.p
        if self.stream is None:
            self.stream = self._open()
        try:
            msg = self.format(record)
            stream = self.stream
            # issue 35046: merged two stream.writes into one.
            stream.write(msg + self.terminator)
            try:
                self.flush()
            except OSError:
                # DBFS mounts raise [Errno 95] on flush; the write itself succeeds.
                pass
        except RecursionError:  # See issue 36272
            raise
        except Exception:
            self.handleError(record)

# logger must be defined
ch = OurFileHandler(log_file_path)  # instead of logging.FileHandler(log_file_path)
formatter = logging.Formatter(
    '%(asctime)s: %(name)s (%(levelname)s): %(message)s'
)
ch.setFormatter(formatter)
logger.addHandler(ch)