Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Error filtering by datetime Lakehouse Federated SQL Server table

ClintHall
New Contributor II
In Unity Catalog, I have a connection to a SQL Server database. When I try to filter by a datetime column using a datetime with fractional seconds, Databricks gives me this error:

Job aborted due to stage failure: com.microsoft.sqlserver.jdbc.SQLServerException: An error occurred during the current command (Done status 0). Conversion failed when converting date and/or time from character string.

On the Databricks side, I use Azure Databricks, and I have tried this both with serverless compute (environment version 3) and with classic compute on 16.4 LTS (includes Apache Spark 3.5.2, Scala 2.12). On the SQL Server side, I have seen it with both Enterprise Edition: Core-based Licensing (64-bit) and SQL Azure.

Here's a minimal case to replicate. In SQL Server:

CREATE TABLE test_datetimes (dt datetime);

INSERT INTO test_datetimes (dt) VALUES ('2025-07-17 14:33:21');
INSERT INTO test_datetimes (dt) VALUES ('2025-07-17 15:53:35');
INSERT INTO test_datetimes (dt) VALUES ('2025-07-17 16:36:33');
INSERT INTO test_datetimes (dt) VALUES ('2025-08-07 13:41:27');
INSERT INTO test_datetimes (dt) VALUES ('2025-08-07 15:41:51');
INSERT INTO test_datetimes (dt) VALUES ('2025-08-07 15:46:22');
INSERT INTO test_datetimes (dt) VALUES ('2025-08-14 11:07:32');
INSERT INTO test_datetimes (dt) VALUES ('2025-09-08 15:04:08');
INSERT INTO test_datetimes (dt) VALUES ('2025-09-08 20:57:18');
INSERT INTO test_datetimes (dt) VALUES ('2025-09-08 21:40:42');
INSERT INTO test_datetimes (dt) VALUES ('2025-09-08 22:24:11');
INSERT INTO test_datetimes (dt) VALUES ('2025-09-09 10:49:18');
INSERT INTO test_datetimes (dt) VALUES ('2025-09-09 11:18:32');
INSERT INTO test_datetimes (dt) VALUES ('2025-09-10 13:47:41');
INSERT INTO test_datetimes (dt) VALUES ('2025-09-29 15:59:32');
INSERT INTO test_datetimes (dt) VALUES ('2025-09-29 23:03:27');

In Databricks:

from pyspark.sql import functions as sf
from datetime import datetime

# No error here:
ts_no_fraction = datetime(2025, 9, 28, 13, 51, 37)
spark.table('my_sql_server_catalog.dbo.test_datetimes').filter(sf.col('dt') >= sf.lit(ts_no_fraction)).display()

# We get the error here:
ts_with_fraction = datetime(2025, 9, 28, 13, 51, 37, 10)
spark.table('my_sql_server_catalog.dbo.test_datetimes').filter(sf.col('dt') >= sf.lit(ts_with_fraction)).display()
Accepted Solution

Isi
Honored Contributor II

Hello @ClintHall 

I believe you’re running into a mismatch between how Spark/Databricks generates the literal and what SQL Server datetime can store.

  • In SQL Server, the datetime type only supports milliseconds (3 decimal places).

  • When you pass a Python datetime without microseconds, Spark translates it into something like '2025-09-28 13:51:37.000'.

  • But if you pass a datetime with microseconds, e.g. datetime(2025, 9, 28, 13, 51, 37, 10), the literal becomes '2025-09-28 13:51:37.000010'.

That’s 6 digits of fractional seconds, which SQL Server datetime cannot parse. When the filter is pushed down, SQL Server tries to convert that literal and fails with the conversion error you’re seeing.
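
As a quick illustration on the Python side (this only shows how Python itself renders the two values, not necessarily the exact literal Spark builds for pushdown):

from datetime import datetime

# With zero microseconds Python prints no fractional part at all;
# with a non-zero microsecond value it prints all six digits.
print(datetime(2025, 9, 28, 13, 51, 37).isoformat(sep=' '))      # 2025-09-28 13:51:37
print(datetime(2025, 9, 28, 13, 51, 37, 10).isoformat(sep=' '))  # 2025-09-28 13:51:37.000010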

This happens because of predicate pushdown: Spark translates your filter into SQL and sends it to the remote database, instead of filtering in Spark. This is usually good for performance (less data moved over the wire), but in your case it exposes the datatype mismatch.


  1. Check the actual SQL query being sent

    • Use EXPLAIN in Databricks to see the physical plan.

    • Or, if possible, capture the query on the SQL Server side (e.g. via SQL Server Profiler or Extended Events) to confirm exactly how the literal looks.

  2. Workarounds

    • Truncate/round your Python datetime values to milliseconds before using them in filters, so Spark generates a literal SQL Server accepts (see the sketch below).

    • If you have schema control, consider changing the column type to datetime2, which supports up to 7 fractional-second digits (100 ns precision).
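
Here's a minimal sketch of the truncation workaround, reusing the catalog and table names from your example (treat them as placeholders). It rounds the Python datetime down to whole milliseconds before building the filter, and calls explain() so you can check the plan that gets pushed down:

from datetime import datetime
from pyspark.sql import functions as sf

ts = datetime(2025, 9, 28, 13, 51, 37, 10)
# Drop everything below millisecond precision so the pushed-down literal
# stays within what SQL Server's datetime type can parse.
ts_ms = ts.replace(microsecond=(ts.microsecond // 1000) * 1000)

df = spark.table('my_sql_server_catalog.dbo.test_datetimes').filter(sf.col('dt') >= sf.lit(ts_ms))
df.explain(True)  # inspect the physical plan / pushed filters
df.display()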


Hope this helps :),

Isi



ClintHall
New Contributor II

Thanks, @Isi. Very helpful. 

It would be nice if Lakehouse Federation did this for us (the same way it knows SQL Server uses ISNULL where Spark SQL uses NVL). Is there a way to bring it to the developers' attention?