11-11-2025 02:47 AM
The error message you are observing in your DLT pipeline logs, specifically:
java.lang.NumberFormatException: For input string: "Fri, 29 Aug 2025 09:02:07 GMT"
suggests that a component in your pipeline, most likely the library or code that handles Azure Data Lake Storage Gen2 (ADLS Gen2) operations, is trying to parse a date string as a numeric value, such as an epoch timestamp, and failing.
Root Cause
- The error originates from the NativeADLGen2RequestComparisonHandler, which appears to be part of the Databricks/Spark library that talks to ADLS Gen2.
- The handler expects a numeric value (typically a Unix timestamp such as 1693296000) but receives a formatted date string such as "Fri, 29 Aug 2025 09:02:07 GMT".
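To make the failure mode concrete, here is a minimal Python sketch of the same class of error. It is an analogy only: the real handler is JVM code whose source I have not seen, and `parse_as_epoch` is a hypothetical stand-in for whatever numeric parse it performs. Python's `int()` raising `ValueError` plays the role of Java's `NumberFormatException`.

```python
from email.utils import parsedate_to_datetime

# The exact value quoted in the error message.
raw = "Fri, 29 Aug 2025 09:02:07 GMT"


def parse_as_epoch(value: str) -> int:
    """Hypothetical strict numeric parse, standing in for the handler's logic."""
    return int(value)


# A strict numeric parse rejects the date string outright.
try:
    parse_as_epoch(raw)
    numeric_parse_failed = False
except ValueError:
    numeric_parse_failed = True

# Parsing the same value as an HTTP-style (RFC 1123) date succeeds
# and yields the epoch seconds a numeric consumer would have wanted.
parsed = parsedate_to_datetime(raw)
epoch_seconds = int(parsed.timestamp())

print(numeric_parse_failed, epoch_seconds)
```

So the value itself is a perfectly valid timestamp; it is just arriving in a text format that the consumer does not expect.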
Why is this happening now?
- Library Update or Backend Change: The format of the value returned (or logged) may have changed, either through a code/library update or through a backend change on the Microsoft/Azure side.
- Misconfigured Pipeline or Upstream Data Issue: If a step in your pipeline changes a field's format or passes metadata with the wrong type, it can also cause this type of error.
- External API/Response Change: If ADLS Gen2 or some middleware changed how it formats headers or metadata (for instance, Last-Modified or similar fields), the current code may be unable to handle the new format.
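For reference, the string in the error has the shape of a standard HTTP date, which is the format used by headers such as Last-Modified. The short sketch below formats an epoch value that way using only the Python standard library. Which field the handler is actually reading is an assumption on my part; the sketch only shows that the two representations describe the same instant.

```python
from datetime import datetime, timezone
from email.utils import format_datetime

# Epoch seconds corresponding to the instant in the error message.
epoch_seconds = 1756458127

# Convert to an aware UTC datetime, then render it as an HTTP-style date.
moment = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
http_date = format_datetime(moment, usegmt=True)

print(http_date)
```

If a service or intermediate layer switched from the numeric form to this text form (or the reverse), code that assumes one representation would fail in exactly the way the log shows.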
Why execution is unaffected
- This appears to be a logging- or comparison-related issue: the handler seems to exist for debugging, logging, or non-essential request validation. It catches and logs the error without propagating it or halting processing.
- The error might be logged on each streaming trigger or update cycle, which would explain the high frequency.
How to Fix or Mitigate
Immediate Workarounds:
- Since the error doesn't break functionality, you can keep running the pipeline as is, though the frequent log entries can obscure real issues and fill up logs quickly.
- If possible, reduce the logging verbosity for this handler's logger in your log4j configuration to cut down on clutter.
Long-term Solutions:
- Check for library updates: Make sure your Databricks Runtime, Spark, and any custom connector libraries for ADLS Gen2 are up to date. Recent versions may already include a fix if it’s a known bug.
- Raise a support ticket: If you are using a managed service like Databricks, raise a ticket with them, quoting the handler name and the exact error. They may know of recent changes.
- Check pipeline config and metadata: Make sure that all fields, especially those involving timestamps or modification dates, are passed in the expected format.
- Review release notes: Check the release notes for Spark, Databricks Runtime, and the Azure ADLS SDKs for breaking changes related to date/time handling in the past few months.
Additional Notes
- If you're using custom code/logic for ADLS file interactions, audit any places where you serialize or deserialize timestamps.
- If this happens only after certain DLT operations, consider temporarily disabling streaming tasks or checkpointing to see if the error stops.
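If the audit turns up custom code that assumes timestamps are always numeric, one defensive pattern is to accept both representations. The helper below is a hypothetical sketch (`to_epoch_seconds` is my own name, not part of any Databricks or Azure API) and assumes the only two formats you encounter are epoch seconds and HTTP-style dates.

```python
from datetime import timezone
from email.utils import parsedate_to_datetime


def to_epoch_seconds(value) -> int:
    """Hypothetical helper: normalize a timestamp to epoch seconds.

    Accepts an int/float, a numeric string, or an HTTP-style date string.
    Anything else raises an exception from the underlying parser.
    """
    if isinstance(value, (int, float)):
        return int(value)

    text = str(value).strip()
    try:
        # Numeric strings such as "1693296000".
        return int(text)
    except ValueError:
        pass

    # Fall back to HTTP-style dates such as "Fri, 29 Aug 2025 09:02:07 GMT".
    parsed = parsedate_to_datetime(text)
    if parsed.tzinfo is None:
        # Assumption: treat dates without a usable offset as UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return int(parsed.timestamp())


print(to_epoch_seconds("1693296000"))
print(to_epoch_seconds("Fri, 29 Aug 2025 09:02:07 GMT"))
```

This only helps for code you own. It does nothing about the message logged by the built-in handler, which still needs a library fix or a support ticket.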
This class of error commonly appears when the serialization or deserialization of metadata fields changes across cloud storage SDK versions. Ensuring version compatibility and reporting the issue to your cloud provider can help resolve it at the root if it is a backend or SDK bug.