Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

federated queries on PostgreSQL - TimestampNTZ option

thomas_berry
New Contributor II

Hello,

I am trying to migrate some Spark reads away from JDBC to federated queries based on Unity Catalog.

Here is an example of the spark read command that I want to migrate:

spark.read.format("jdbc").option("driver", "org.postgresql.Driver").option("url", url).option("user", user).option("password", password).option("query", query).option("preferTimestampNTZ", True).load()That "preferTimestampNTZ" option is really important because I have many table columns in the source that timestamps without timezone and are not in UTC. 
Problem is that I cannot find in the documentation on how to set up the catalog to leverage this option.

The closest I can find is at the bottom of this page: Run federated queries on PostgreSQL | Databricks Documentation, but it doesn't explain how that option can be applied.

Here is an example of how I want the future queries to look:

spark.table("postgres_catalog.postgres_database.postgres_table")

Does anyone know?

Thank You

3 REPLIES

lingareddy_Alva
Honored Contributor II

Hi @thomas_berry 

This is a common challenge when transitioning from Spark JDBC reads to Unity Catalog federated tables, especially where timestamp types are involved.
The preferTimestampNTZ option is specific to the JDBC data source: it tells Spark to map timestamp columns without a time zone to TIMESTAMP_NTZ (no time zone) instead of the default TIMESTAMP, which is normalized to UTC.
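For reference, a quick way to see what that JDBC option actually changes is to compare the inferred schema with and without it. A minimal sketch, assuming url, user, password and query are defined as in your original read, on a runtime where preferTimestampNTZ is supported (Spark 3.4+):

# Compare the inferred timestamp type with and without preferTimestampNTZ.
# url, user, password, query are assumed to be defined as in the original JDBC read.
reader = (spark.read.format("jdbc")
          .option("driver", "org.postgresql.Driver")
          .option("url", url)
          .option("user", user)
          .option("password", password)
          .option("query", query))

reader.load().printSchema()  # timestamp columns come back as 'timestamp'

# Adding the option to the same reader and loading again flips them to 'timestamp_ntz'
reader.option("preferTimestampNTZ", True).load().printSchema()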

Federated Tables in Unity Catalog
When you migrate to Unity Catalog federated tables, the underlying mechanics change (a minimal setup sketch follows below):
-- You're no longer using spark.read.format("jdbc"); Databricks connects through a foreign catalog and pushes SQL down via its built-in connector.
-- Foreign tables are metadata-only: schemas are synced automatically and query execution is pushed down to the external system (PostgreSQL in your case).
-- Because you query them via spark.table("...") or SQL (SELECT * FROM ...), you're using the Databricks SQL dialect; raw JDBC options are not exposed.
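For context, the connection and foreign catalog are created with SQL DDL, and that DDL only takes connection details; there is no timestamp-mapping option to set there. A minimal setup sketch run from a notebook, with placeholder host, database and secret scope/key names (all hypothetical):

# Sketch of the foreign-catalog setup (placeholder names throughout).
# Note: there is no preferTimestampNTZ-style option in this DDL today.
spark.sql("""
    CREATE CONNECTION IF NOT EXISTS postgres_conn TYPE postgresql
    OPTIONS (
      host 'pg-host.example.com',
      port '5432',
      user secret('my_scope', 'pg_user'),
      password secret('my_scope', 'pg_password')
    )
""")

spark.sql("""
    CREATE FOREIGN CATALOG IF NOT EXISTS postgres_catalog
    USING CONNECTION postgres_conn
    OPTIONS (database 'postgres_database')
""")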

No Option for preferTimestampNTZ:
Currently, Unity Catalog does not expose a documented setting like preferTimestampNTZ=True for federated PostgreSQL connections.
Instead, Databricks applies an automatic type mapping, and by default:
-- PostgreSQL's timestamp without time zone is mapped to TIMESTAMP (UTC) in Databricks.
-- If your data is not in UTC, this mapping can silently shift your values (for example, a value recorded as 09:00 New York local time gets read as 09:00 UTC, an instant four or five hours earlier than intended) unless you apply manual conversion logic (e.g., using from_utc_timestamp()).

Workarounds:
1. Manually adjust time zones in queries
If your timestamps are in a known local timezone (say "America/New_York"), you can do:

In SQL:
SELECT from_utc_timestamp(your_column, 'America/New_York') FROM postgres_catalog.db.table

Or in PySpark:
from pyspark.sql.functions import from_utc_timestamp

df = (spark.table("postgres_catalog.db.table")
      .withColumn("local_ts", from_utc_timestamp("your_column", "America/New_York")))

2. Use Views to Abstract the Logic
If this transformation is needed in many places, create views that apply the adjustment. Note that foreign catalogs are read-only, so the view has to live in a regular Unity Catalog schema (shown here with the placeholder your_catalog.your_schema) while selecting from the federated table:

CREATE VIEW your_catalog.your_schema.adjusted_table AS
SELECT from_utc_timestamp(ts_col, 'America/New_York') AS ts_col_local, ...
FROM postgres_catalog.db.raw_table

3. Keep JDBC reads where necessary
If your timestamp logic is critical and cannot be centralized via views, you may need to keep JDBC reads for
certain use cases until Databricks exposes a control for this behavior in Unity Catalog.
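If you go with option 3, a thin helper keeps the split explicit: JDBC (with preferTimestampNTZ) only for the timestamp-sensitive sources, the foreign catalog for everything else. A rough sketch; the url, user and password variables and the table names are placeholders:

# Hypothetical helper: use JDBC only where TIMESTAMP_NTZ semantics matter.
def read_pg_ntz(query):
    return (spark.read.format("jdbc")
            .option("driver", "org.postgresql.Driver")
            .option("url", url)
            .option("user", user)
            .option("password", password)
            .option("preferTimestampNTZ", True)
            .option("query", query)
            .load())

# Timestamp-sensitive table: keep the JDBC path
events = read_pg_ntz("SELECT * FROM events")

# Everything else: the federated catalog
customers = spark.table("postgres_catalog.postgres_database.customers")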

 

Federated tables abstract away many low-level controls (like JDBC options) to simplify cross-source querying
— but that means some fine-grained behaviors are lost, especially those related to timezone-sensitive data types.
Until Databricks adds an equivalent of preferTimestampNTZ to foreign catalog configurations,
you'll need to handle these cases explicitly.

 

LR

thomas_berry
New Contributor II

Thank you for your reply. I will mark your answer as the accepted solution, but I hope that in the future this option, as you explained, gets added!

lingareddy_Alva
Honored Contributor II

Thanks @thomas_berry I hope so 😀

LR
