Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Disable Logging in Python `dbutils.fs.put`?

ToBeDataDriven
New Contributor

Every time this function writes a file, it prints "Wrote n bytes." to stdout. I want to disable that logging because I'm writing thousands of files and it floods the log with meaningless information. Does anyone know if it's possible?

4 REPLIES

K_Anudeep
Databricks Employee

Hello @ToBeDataDriven ,

If it's a notebook cell, you can silence the cell's output by putting the %%capture cell magic on its first line:

%%capture
dbutils.fs.put("dbfs:/FileStore/anudeep/datasets/word_count/tmp/3.txt","foo")

 

If you want to do it in your code, dbutils.fs.put has no flag to silence the output, but you can use a Python wrapper function as a workaround.

Code:

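from contextlib import redirect_stdout, redirect_stderr
from io import StringIO

def dbutils_put(path: str, contents: str, overwrite: bool = True) -> None:
    # Throwaway buffer that absorbs whatever dbutils.fs.put prints.
    sink = StringIO()
    # Note: this rebinds sys.stdout and sys.stderr for the whole process
    # while the block is active, not only for the calling thread.
    with redirect_stdout(sink), redirect_stderr(sink):
        dbutils.fs.put(path, contents, overwrite)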

dbutils_put("dbfs:/FileStore/anudeep/streaming_datasets/word_count/tmp/4.txt","abc")

 

Please let me know if this works for you. Thanks

 

ToBeDataDriven
New Contributor

Thanks for the reply!

Would that code solution be per thread or would it only work on stdout globally?

I basically have a thread pool doing the work, and I need all the output except fs.put. If it disables it globally I may lose output while it's performing the put.

 

K_Anudeep
Databricks Employee

@ToBeDataDriven ,
redirect_stdout is process-global, not per-thread.
It temporarily rebinds sys.stdout for the whole interpreter, so in a thread pool you can accidentally swallow other threads' prints while the with block is active. It's not thread-safe for your use case. If you need all the output except fs.put, then you need to isolate the put call's output from the other threads; otherwise, I don't think there is another option.
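
One possible way to isolate it is a small stream wrapper that drops writes only from threads that have asked to be muted. This is a minimal sketch, not a Databricks feature: ThreadFilteredStream and quiet_call are hypothetical names, and it assumes the "Wrote n bytes." message is written through Python's sys.stdout from the same thread that calls dbutils.fs.put (which the redirect_stdout wrapper above also relies on). I have not verified how it interacts with the notebook's own stdout handling.

```python
import threading
from contextlib import contextmanager


class ThreadFilteredStream:
    """Hypothetical helper: wraps a text stream and discards writes
    coming from threads that are currently inside muted()."""

    def __init__(self, wrapped):
        self._wrapped = wrapped
        # Each thread gets its own 'muted' flag via thread-local storage.
        self._local = threading.local()

    def write(self, text):
        if getattr(self._local, "muted", False):
            # Pretend the write succeeded, but drop the text.
            return len(text)
        return self._wrapped.write(text)

    def flush(self):
        self._wrapped.flush()

    def __getattr__(self, name):
        # Delegate everything else (encoding, isatty, ...) to the real stream.
        return getattr(self._wrapped, name)

    @contextmanager
    def muted(self):
        # Mute only the calling thread; restore its previous state on exit.
        previous = getattr(self._local, "muted", False)
        self._local.muted = True
        try:
            yield
        finally:
            self._local.muted = previous


def quiet_call(stream, func, *args, **kwargs):
    """Run func with the calling thread's output to 'stream' suppressed."""
    with stream.muted():
        return func(*args, **kwargs)


# Assumed usage: install once, before starting the thread pool.
#   import sys
#   sys.stdout = ThreadFilteredStream(sys.stdout)
#   quiet_call(sys.stdout, dbutils.fs.put, path, contents, True)
```

Unlike redirect_stdout, sys.stdout is replaced only once here, and each worker thread toggles its own flag, so prints from other threads keep going through while one thread is inside quiet_call.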

 

K_Anudeep
Databricks Employee

@ToBeDataDriven, if the above answered your question, could you please accept the solution?
