What is the maximum size that can be read using dbutils.fs.head?

murtadha_s
Databricks Partner

Hi,

What is the maximum size that can be read using dbutils.fs.head()?

Is there a limit? An AI says 10 MB, but I couldn't find anything useful in the documentation, and when I tried it on an actual cluster it seemed limited only by the driver memory.

Thanks in advance.

 

DivyaandData
Databricks Employee

dbutils.fs.head() itself does not have a documented hard cap like 10 MB.

From the official dbutils reference, the signature is:

dbutils.fs.head(file: String, max_bytes: int = 65536): String

“Returns up to the specified maximum number of bytes in the given file. The bytes are returned as a UTF-8 encoded string.”

So:

  • Default: If you don’t pass max_bytes, it returns up to 65,536 bytes (~64 KB).
  • Upper limit: Docs only say “max_bytes: int” and do not specify a fixed maximum. In practice the limit is whatever:
    • The driver can hold in memory, and
    • The notebook output UI can render (there’s a separate per-cell output cap, e.g. via %set_cell_max_output_size_in_mb with a range of 1–20 MB).
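If it's the rendered cell output that is truncating what you see (rather than head itself), the per-cell cap mentioned above can be adjusted with the notebook magic, e.g.:

```
%set_cell_max_output_size_in_mb 20
```

Availability of this magic can depend on your workspace and runtime version, so treat it as something to verify in your environment.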

That’s why your experiments show it being “limited by the driver memory”: that’s effectively the real bound. The “10 MB” figure some AIs cite likely conflates the notebook output limit with an intrinsic dbutils.fs.head limit, which isn’t documented.

Sources: https://docs.databricks.com/aws/en/notebooks/notebooks-code and https://learn.microsoft.com/en-us/azure/databricks/dev-tools/databricks-utils

If this answers your question, please mark it as the accepted solution so others can find it more easily.