Databricks Community

MDAtl · 3 weeks ago

All our on-prem data sources have their time zone set to local (US Eastern), and all timestamps stored in their tables are in US Eastern time. As we ETL them to Databricks, we're noticing that the same wall-clock value is now stored as UTC. So what was 6 AM Eastern is now 6 AM UTC, which is 2 AM Eastern (during daylight saving time). We'd like 6 AM Eastern in our source system to be stored as 6 AM Eastern (10 AM UTC during daylight saving time) in our Delta Lake tables.

Is there a way to set the default timezone in Databricks at the workspace level? Everything I have seen so far shows how to set it at the session or compute level, and we'd really like to avoid that.

balajij8 · 3 weeks ago

There is no single configuration for this, but you can get a similar effect in a few ways.

SQL Warehouses - Set the TIMEZONE parameter globally through the SQL configuration parameters in the warehouse settings, so that all sessions on the warehouse inherit this time zone by default.
Compute - Set spark.sql.session.timeZone in the cluster configuration (Spark config) and use cluster policies to enforce it across multiple clusters.

Setting the time zone configuration affects how timestamps are displayed and manipulated, but it doesn't fix timestamps that are already stored incorrectly. You can handle this during ingestion either by using TIMESTAMP_NTZ columns in Delta tables to preserve wall-clock time without time zone interpretation, or by using to_utc_timestamp() to convert Eastern time to UTC properly.
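As a rough sketch of the policy-based option, a cluster policy definition can pin the Spark setting so that clusters created under the policy cannot override it. The time zone value, the function name build_timezone_policy, and the idea of generating the JSON from Python are assumptions for illustration; check the attribute path against the cluster policy reference before relying on it.

```python
import json

# Assumption: America/New_York is the zone you want to pin; adjust as needed.
PINNED_TIME_ZONE = "America/New_York"


def build_timezone_policy(time_zone: str) -> dict:
    """Return a cluster policy definition that fixes the session time zone.

    Hypothetical helper for illustration; it only builds a dictionary.
    """
    return {
        "spark_conf.spark.sql.session.timeZone": {
            "type": "fixed",      # users of the policy cannot change the value
            "value": time_zone,
        }
    }


# Serialise the definition so it can be pasted into the policy editor.
policy_json = json.dumps(build_timezone_policy(PINNED_TIME_ZONE), indent=2)
print(policy_json)
```

This only covers classic compute created under the policy. It does not reach serverless compute or sessions that bypass the policy, so it narrows the risk rather than removing it.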

Ashwin_DSA · 3 weeks ago

Hi @MDAtl,

The short answer is no: there is no workspace-wide timezone setting that covers all Databricks compute.

The long answer is that this is expected TIMESTAMP behaviour. TIMESTAMP values represent an absolute point in time and are normalised and persisted in UTC; the session time zone is applied only when values are displayed or when date-time fields are extracted.
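To make that behaviour concrete, here is a small standard-library Python sketch (not Databricks code) of what happens when a New York wall-clock value is ingested as if it were UTC. The sample value is an assumption chosen for illustration.

```python
from datetime import datetime
from zoneinfo import ZoneInfo

UTC = ZoneInfo("UTC")
NEW_YORK = ZoneInfo("America/New_York")

# The source system wrote 06:00 meaning "6 AM in New York" (no zone attached).
source_wall_clock = datetime(2026, 1, 15, 6, 0, 0)

# Ingesting under a UTC session labels that wall-clock value as 06:00 UTC.
misread_instant = source_wall_clock.replace(tzinfo=UTC)

# Rendering the same instant for a New York session shifts the wall clock.
shown_in_new_york = misread_instant.astimezone(NEW_YORK)

print(misread_instant.isoformat())    # 2026-01-15T06:00:00+00:00
print(shown_in_new_york.isoformat())  # 2026-01-15T01:00:00-05:00
```

The stored instant never changes between the two prints; only the rendering does. That is why changing a display time zone cannot repair a value that was labelled with the wrong zone at ingestion.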

There is no documented workspace-wide setting that changes this. For Databricks SQL, the TIMEZONE parameter can be set at the session level and also globally for SQL warehouses by using SQL configuration parameters or the SQL Warehouse API, and the documented system default is UTC. For SQL warehouse admin settings, the workspace-level SQL parameter setting applies to all SQL warehouses in the workspace, and changing it restarts running SQL warehouses.

For notebooks and jobs, the public guidance is still session or compute-scoped rather than workspace-scoped. The Spark configuration docs show that notebook-level settings affect only the current SparkSession, compute-level settings apply to workloads on that compute, and for serverless notebooks and jobs, the documented default for spark.sql.session.timeZone is Etc/UTC.

Because of that, the recommended approach for ETL is to convert the source timestamps explicitly during ingestion instead of relying on a workspace default. If your source value 2026-01-15 06:00:00 means "6 AM in New York", convert it to UTC on write using to_utc_timestamp in SQL or the PySpark equivalent, so that it is stored as the equivalent instant. For this January date that is 2026-01-15 11:00:00 UTC, because New York is on standard time (UTC-5); the same wall-clock time on a date in daylight saving time (UTC-4) would map to 10:00:00 UTC. Use a region-based time zone such as America/New_York rather than a short form like EST, because the TIMEZONE parameter documentation notes that short names can be ambiguous.
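The offset arithmetic can be checked outside Databricks with a short standard-library Python sketch. The helper name local_to_utc and the July date are assumptions added for illustration; the January value is the one used in this post.

```python
from datetime import datetime
from zoneinfo import ZoneInfo


def local_to_utc(wall_clock: datetime, zone_name: str) -> datetime:
    """Interpret a naive wall-clock value in zone_name and return it in UTC."""
    localized = wall_clock.replace(tzinfo=ZoneInfo(zone_name))
    return localized.astimezone(ZoneInfo("UTC"))


# Same wall-clock time, one date in standard time and one in daylight saving time.
winter = local_to_utc(datetime(2026, 1, 15, 6, 0), "America/New_York")
summer = local_to_utc(datetime(2026, 7, 15, 6, 0), "America/New_York")

print(winter.strftime("%Y-%m-%d %H:%M:%S"))  # 2026-01-15 11:00:00
print(summer.strftime("%Y-%m-%d %H:%M:%S"))  # 2026-07-15 10:00:00
```

A region-based zone name picks the correct offset for each date, which a fixed abbreviation such as EST cannot do.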

Sample SQL:

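-- raw_ts is a local wall-clock timestamp from the source system
-- Example: 2026-01-15 06:00:00 means 6 AM in America/New_York
-- Assumption: the session time zone is UTC (the documented default), since CAST reads raw_ts in the session time zone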

INSERT INTO main.prod.target_table
SELECT
  id,
  to_utc_timestamp(CAST(raw_ts AS TIMESTAMP), 'America/New_York') AS event_ts_utc
FROM main.prod.source_table;

If you want to verify the conversion for a sample value:

SELECT
  CAST('2026-01-15 06:00:00' AS TIMESTAMP) AS source_local_time,
  to_utc_timestamp(CAST('2026-01-15 06:00:00' AS TIMESTAMP), 'America/New_York') AS stored_utc_time;

Python sample:


from pyspark.sql import functions as F

df_out = (
    df
    .withColumn(
        "event_ts_utc",
        F.to_utc_timestamp(F.to_timestamp("raw_ts"), "America/New_York")
    )
)

df_out.write.format("delta").mode("append").saveAsTable("main.prod.target_table")
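If the same conversion has to be applied to many tables, one possible approach is to generate the SELECT list from column metadata rather than writing each expression by hand. The sketch below is plain Python and makes several assumptions: the helper name utc_select_list is hypothetical, the schema dictionary is invented sample data, and in practice the column names and types would come from your own table metadata.

```python
from typing import Mapping


def utc_select_list(columns: Mapping[str, str], source_zone: str) -> str:
    """Build a SELECT list that wraps every TIMESTAMP column in to_utc_timestamp."""
    expressions = []
    for name, data_type in columns.items():
        if data_type.strip().lower() == "timestamp":
            # Convert wall-clock values from the source zone to UTC.
            expressions.append(
                f"to_utc_timestamp(`{name}`, '{source_zone}') AS `{name}`"
            )
        else:
            # Pass non-timestamp columns through unchanged.
            expressions.append(f"`{name}`")
    return ",\n  ".join(expressions)


# Hypothetical schema: column name -> declared type.
schema = {"id": "bigint", "raw_ts": "timestamp", "updated_at": "timestamp"}
select_list = utc_select_list(schema, "America/New_York")
print(f"SELECT\n  {select_list}\nFROM main.prod.source_table")
```

Generating the expressions keeps the time zone in one place, so a single ingestion framework can apply it consistently instead of relying on each session or cluster being configured correctly.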

Any change you make to SQL settings, session settings, or ETL logic will only affect new reads and writes going forward. It will not retroactively fix data that has already been ingested incorrectly. Existing rows would need to be reprocessed or backfilled if you want the stored values corrected.

If this answer resolves your question, could you mark it as “Accept as Solution”? That helps other users quickly find the correct fix.

Regards,
Ashwin | Delivery Solution Architect @ Databricks
Helping you build and scale the Data Intelligence Platform.
***Opinions are my own***

MDAtl · 3 weeks ago

Hi @Ashwin_DSA

Thank you so much for your response and a detailed explanation.

Having to transform every timestamp column using to_utc_timestamp is not very scalable. We have a few thousand tables that we're looking to move to Databricks.

Setting it at the compute level helps somewhat, but we're also looking to move some future ETL processes to serverless compute. My main concern is that if we cannot set the default at a higher level (workspace), it's prone to someone forgetting to set it in their session or compute, and we end up with a mix of good and bad timestamps.

Is there a technical reason why the setting is not available at workspace level?

Thanks,

MD

Ashwin_DSA · 3 weeks ago

Hi @MDAtl,

Thanks for following up. I did some research on this and couldn't find any publicly documented rationale for why Databricks doesn't expose a workspace-wide default timezone setting across all compute.

I did, however, find similar community discussions asking for related capabilities, so this does not seem to be an isolated requirement.

Given the scale you described and the risk of inconsistent handling across classic and serverless pipelines, this may be worth raising as a feature request through your Databricks account team. I'll also raise it internally on my end, but I want to set expectations clearly that this is not a definite commitment to implementation. Prioritisation typically depends on overall customer demand, business impact, and roadmap considerations.

According to the public documentation today, the supported model is that Databricks SQL can have broader SQL warehouse-level settings via the TIMEZONE parameter, whereas notebooks and jobs rely on session- or compute-scoped settings, and the serverless documentation lists spark.sql.session.timeZone with a default of Etc/UTC.

If preserving local wall-clock time is the main requirement, one option worth evaluating is TIMESTAMP_NTZ, since standard TIMESTAMP values represent an absolute instant and are persisted in UTC. This is still Public Preview though.

As I said before, even if you standardise the approach going forward, it will not retroactively fix data that has already been ingested with the wrong timezone interpretation. Existing data would need to be reprocessed or backfilled if you want those stored values corrected.

Hope this helps.

If this answer resolves your question, could you mark it as “Accept as Solution”? That helps other users quickly find the correct fix.

Regards,
Ashwin | Delivery Solution Architect @ Databricks
Helping you build and scale the Data Intelligence Platform.
***Opinions are my own***

Setting timezone at a workspace level
