Timestamps in Databricks are getting converted to a different timezone

Dinu2
New Contributor III

Timestamp columns extracted from source databases using a JDBC read are getting converted to a different timezone and do not match the source timestamps. Could anyone suggest how we can get the same timestamp data as in the source?

7 REPLIES

-werners-
Esteemed Contributor III

Can you check this?

Dinu2
New Contributor III

Thank you. If I keep the cluster timezone as "UTC", it converts all the timestamps to UTC. What I am looking for is to get the timestamp field values in the dataframe exactly as they are in the source. Could you please let me know if you have any suggestions?

Anonymous
Not applicable

@Werner Stinckens: Would you like to take this ahead? 🙂

Anonymous
Not applicable

@Dinu Sukumara: My take on your question -

If you want to preserve the timestamp values exactly as they are in the source database, without any timezone conversion, you can follow these steps:

1. Set the cluster timezone: keep the cluster timezone as UTC, as you mentioned.
2. Adjust the session timezone: before reading the data from the source database, adjust the session timezone in Databricks to match the timezone of the source data:

spark.conf.set("spark.sql.session.timeZone", "<source_timezone>")

3. Read data from the source: use the JDBC read functionality in Databricks to extract the data from the source database (a combined sketch of these steps follows below).

df = spark.read.format("jdbc").option("url", "<jdbc_url>").option("dbtable", "<table_name>").load()
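A minimal sketch putting steps 2 and 3 together. The timezone "Asia/Kolkata" is only an assumed example of a source timezone, and the JDBC URL, table name, and credentials are placeholders you would replace with your own values:

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Assumption: the source database stores timestamps in IST ("Asia/Kolkata");
# replace the timezone and the JDBC placeholders with your actual values.
spark.conf.set("spark.sql.session.timeZone", "Asia/Kolkata")

df = (
    spark.read.format("jdbc")
    .option("url", "<jdbc_url>")        # placeholder JDBC URL
    .option("dbtable", "<table_name>")  # placeholder table name
    .option("user", "<user>")           # placeholder credentials
    .option("password", "<password>")
    .load()
)

# Timestamp columns are now displayed in the session timezone set above.
df.show(truncate=False)

With this approach the stored values are still interpreted by Spark internally, but they are rendered in the session timezone, so they should look the same as in the source system.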

-werners-
Esteemed Contributor III

Besides the options mentioned earlier, there is also the convert_timezone function.
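A small sketch of how convert_timezone could be used from PySpark, assuming a hypothetical UTC column named "event_ts" and an assumed source timezone of "Asia/Kolkata" (the function is available in recent Databricks runtimes; adjust names and zones to your data):

from pyspark.sql import functions as F

# Assumption: "event_ts" holds UTC timestamps and the source timezone is IST.
df_local = df.withColumn(
    "event_ts_local",
    F.expr("convert_timezone('UTC', 'Asia/Kolkata', event_ts)")
)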

Anonymous
Not applicable

Hi @Dinu Sukumara

We haven't heard from you since the last response from @Werner Stinckens. Kindly share the requested information with us so that we can help you find a solution.

Thanks and Regards

Dinu2
New Contributor III

Hi @Vidula Khanna @Suteja Kanuri @Werner Stinckens, after changing the timezone at the cluster level to UTC I am getting the source data as expected, but wherever I use the current timestamp in Databricks I now need to use the from_utc_timestamp function, which requires more changes. So I am still checking if there are any other workarounds. Also, can we set the timezone at the Databricks database level or similar? Any suggestions please?
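For reference, a minimal sketch of the from_utc_timestamp workaround described above, assuming the cluster timezone is UTC and an assumed source timezone of "Asia/Kolkata"; the column name "load_ts_local" is only illustrative:

from pyspark.sql import functions as F

# Assumption: cluster timezone is UTC and the source data is in IST;
# shift current_timestamp() into the source timezone before writing it out.
df_with_now = df.withColumn(
    "load_ts_local",
    F.from_utc_timestamp(F.current_timestamp(), "Asia/Kolkata")
)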
