- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
10-03-2025 11:19 AM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
10-03-2025 05:27 PM
Hello @ClintHall
I believe you’re running into a mismatch between how Spark/Databricks generates the literal and what SQL Server datetime can store.
In SQL Server, the datetime type only supports milliseconds (3 decimal places). Docs
When you pass a Python datetime without microseconds, Spark translates it into something like -> '2025-09-28 13:51:37.000'
But if you pass a datetime with microseconds, e.g. datetime(2025, 9, 28, 13, 51, 37, 10) -> '2025-09-28 13:51:37.000010'
That’s 6 digits of fractional seconds, which SQL Server datetime cannot parse. When the filter is pushed down, SQL Server tries to convert that literal and fails with the conversion error you’re seeing.
This happens because of predicate pushdown: Spark translates your filter into SQL and sends it to the remote database, instead of filtering in Spark. This is usually good for performance (less data moved over the wire), but in your case it exposes the datatype mismatch.
- Check the actual SQL query being sent
Use EXPLAIN in Databricks to see the physical plan. Docs
Or, if possible, capture the SQL query on the SQL Server side (e.g. via profiler or extended events) to confirm exactly how the literal looks.
Workarounds
Truncate/round your Python datetime values to milliseconds before using them in filters, so Spark generates something SQL Server accepts.
If you have schema control, consider changing your column type to datetime2, which supports up to 7 fractional digits (microseconds). Docs
Hope this helps :),
Isi
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
10-06-2025 09:47 AM
Thanks, @Isi. Very helpful.
It would be nice if Lakehouse federation would do this for us (the same way that it knows SQL Server uses ISNULL where Spark SQl uses NVL). Is there a way to bring it the dev's attention?