Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

What is the maximum size readable with dbutils.fs.head?

murtadha_s
Databricks Partner

Hi,

What is the maximum size that dbutils.fs.head() can read?

Is there a limit? An AI assistant says 10 MB, but I couldn't find anything useful in the documentation. When I tried it myself, the only limit seemed to be the driver's memory.

Thanks in advance.

 

1 REPLY

DivyaandData
Databricks Employee

dbutils.fs.head() itself does not have a documented hard cap like 10 MB.

From the official dbutils reference, the signature is:

dbutils.fs.head(file: String, max_bytes: int = 65536): String

"Returns up to the specified maximum number of bytes in the given file. The bytes are returned as a UTF-8 encoded string."

So:

  • Default: If you don't pass max_bytes, it returns up to 65,536 bytes (~64 KB).
  • Upper limit: The docs only declare max_bytes: int and do not specify a fixed maximum. In practice the limit is whatever:
    • the driver can hold in memory, and
    • the notebook output UI can render (there is a separate per-cell output cap, e.g. via %set_cell_max_output_size_in_mb, with a range of 1–20 MB).
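The documented behavior can be illustrated with a small local analogue: read at most max_bytes from a file and decode them as UTF-8. This is just a sketch of the semantics described above, not the actual dbutils implementation, which runs on the Databricks driver and accepts DBFS paths.

```python
# Local sketch of the documented dbutils.fs.head semantics:
# read up to max_bytes from a file and return them as a UTF-8 string.
# (Illustrative only; the real dbutils.fs.head is a Databricks utility.)

def head(path: str, max_bytes: int = 65536) -> str:
    with open(path, "rb") as f:
        # Stops after max_bytes, matching the documented default of 64 KB.
        data = f.read(max_bytes)
    # Decode as UTF-8; replace invalid bytes rather than raising, since a
    # byte cutoff can land in the middle of a multi-byte character.
    return data.decode("utf-8", errors="replace")
```

On Databricks itself you would simply call, for example, dbutils.fs.head("dbfs:/tmp/myfile.txt", 1024 * 1024) to request up to 1 MB (the path here is hypothetical).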

That's why your experiments show it being "limited by the driver memory": that is effectively the real bound. The "10 MB" figure some AIs cite likely confuses the notebook output limit with an intrinsic dbutils.fs.head limit, which isn't documented.

Sources: https://docs.databricks.com/aws/en/notebooks/notebooks-code, https://learn.microsoft.com/en-us/azure/databricks/dev-tools/databricks-utils

If this answers your question, please mark it as the accepted solution so others can find it more easily.