Hi, I have multiple datasets in my data lake that feature valid_from and valid_to columns indicating the validity of rows. If a row is currently valid, this is indicated by valid_to = 9999-12-31 00:00:00. Example: Loading this into a Spark DataFrame works fine...
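For context, a minimal PySpark sketch of the pattern described above; the path and column layout are assumed for illustration, not taken from the original post:

from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

# Hypothetical dataset with validity columns; 9999-12-31 marks open-ended rows.
df = spark.read.format("delta").load("/mnt/lake/my_dataset")  # path is assumed
current_rows = df.filter(
    F.col("valid_to") == F.to_timestamp(F.lit("9999-12-31 00:00:00"))
)
current_rows.show()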
Be aware that in Databricks Runtime 15.2 LTS this behavior is broken. I cannot find the exact code, but it is most likely related to the following change: https://github.com/apache/spark/commit/c1c710e7da75b989f4d14e84e85f336bc10920e0#diff-f9ddcc6cba651c6ebfd34e29ef049c3...
I have a field stored as a string in the format "12/30/2022 10:30:00 AM". If I use the function TO_DATE, I only get the date part... I want the full date and time. If I use the function TO_TIMESTAMP, I get the date and time, but it's assumed to be UTC, ...
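A hedged sketch of one way to keep both the date and the time: to_timestamp with an explicit pattern, where 'hh' is the 12-hour clock and 'a' the AM/PM marker. Note that the parsed value is interpreted in the session time zone (spark.sql.session.timeZone), which is why it appears as UTC when the session default is UTC:

from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

# Parse a 12-hour-clock string into a full timestamp.
df = spark.createDataFrame([("12/30/2022 10:30:00 AM",)], ["raw"])
df.select(
    F.to_timestamp("raw", "MM/dd/yyyy hh:mm:ss a").alias("ts")
).show(truncate=False)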
Spark 3.4 and Databricks Runtime 13 introduce two new timestamp types for handling time zone information:
- TIMESTAMP WITH LOCAL TIME ZONE: this type assumes that the input data is in the session's local time zone and converts it to UTC before processing....
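Assuming Spark 3.4+/Databricks Runtime 13+, a small sketch contrasting the session-zone-aware type with the TIMESTAMP_NTZ (no time zone) type:

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# TIMESTAMP is interpreted in the session time zone; TIMESTAMP_NTZ carries
# no zone at all, so the wall-clock value is stored as-is.
spark.sql("""
    SELECT
        CAST('2022-12-30 10:30:00' AS TIMESTAMP)     AS ts_ltz,
        CAST('2022-12-30 10:30:00' AS TIMESTAMP_NTZ) AS ts_ntz
""").printSchema()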
I have data that looks like this:
2021-11-25T19:00:00.000-0500
2021-03-03T13:00:00.000-0500
2021-03-09T15:00:00.000-0500
2021-03-13T16:00:00.000-0500
2021-03-19T03:00:00.000-0400
2021-05-28T03:00:00.000-0400
which is accurate, except I'm pulling the da...
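A minimal sketch (sample values copied from the post) of parsing these offset-bearing strings. to_timestamp honors the embedded offset and then renders the result in the session time zone, which may explain values appearing shifted:

from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

# 'Z' in the pattern matches numeric offsets such as -0500 and -0400.
data = [("2021-11-25T19:00:00.000-0500",), ("2021-03-19T03:00:00.000-0400",)]
df = spark.createDataFrame(data, ["raw"])
df.select(
    F.to_timestamp("raw", "yyyy-MM-dd'T'HH:mm:ss.SSSZ").alias("ts")
).show(truncate=False)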
Hi @Jonathan Dufault, hope everything is going great. Just wanted to check in to see if you were able to resolve your issue. If yes, would you be happy to mark an answer as best so that other members can find the solution more quickly? If not, please tell us...
For some reason, the parameter keeps showing my local time in UTC. When converting this parameter from EST, the offset is wrong. In the screenshot, you can see that my local session TZ is set to Etc/UTC (for whatever reason). It looks like the par...
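For what it's worth, the displayed value of a timestamp follows spark.sql.session.timeZone, so pinning it explicitly sidesteps the Etc/UTC default; America/New_York below is an assumed choice for an EST/EDT user:

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Timestamps render in the session time zone; set it explicitly rather than
# relying on the cluster default (Etc/UTC in the screenshot above).
spark.conf.set("spark.sql.session.timeZone", "America/New_York")
spark.sql("SELECT current_timestamp() AS now_local").show(truncate=False)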
Hi @Zachary Higgins, hope everything is going great. Does @Prasad Wagh's response answer your question? If yes, would you be happy to mark it as best so that other members can find the solution more quickly? We'd love to hear from you. Thanks!
Hello, we are using Azure Databricks with a Standard DS14_v2 cluster on Runtime 9.1 LTS (Spark 3.1.2, Scala 2.12) and frequently face the issue below when running our ETL pipeline. The failing operation involves several joins...
Hey, please use these configurations in your cluster and it should work:
spark.sql.storeAssignmentPolicy LEGACY
spark.sql.parquet.binaryAsString true
spark.speculation false
spark.sql.legacy.timeParserPolicy LEGACY
If it doesn't work, let me know what problem...
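The three SQL-scoped options above can be tried at session level before baking them into the cluster's Spark config; spark.speculation is a core setting and belongs in the cluster config itself. A sketch:

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Session-level equivalents of the suggested cluster settings.
spark.conf.set("spark.sql.storeAssignmentPolicy", "LEGACY")
spark.conf.set("spark.sql.parquet.binaryAsString", "true")
spark.conf.set("spark.sql.legacy.timeParserPolicy", "LEGACY")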
Hi, I found a bug in the from_utc_timestamp function when converting from a UTC timestamp to an EST timestamp. Below is the query:
select trim(current_timestamp()) as Current,
trim(from_utc_timestamp(current_timestamp(),'EST')) as EST,
trim(from...
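One likely explanation rather than a bug: 'EST' is a fixed UTC-5 abbreviation that never applies daylight saving time, while a region ID such as 'America/New_York' does. A sketch for comparison, assuming a UTC session time zone:

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# 'EST' is always UTC-5; 'America/New_York' switches to UTC-4 during DST.
spark.sql("""
    SELECT
        current_timestamp()                                          AS utc_now,
        from_utc_timestamp(current_timestamp(), 'EST')               AS est_fixed,
        from_utc_timestamp(current_timestamp(), 'America/New_York')  AS ny_dst_aware
""").show(truncate=False)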
Hi all, there is a random error when pushing data from Databricks to an Azure SQL Database. Has anyone else had this problem? Any ideas are appreciated. See the stack trace attached.
Target: Azure SQL Database, Standard S6: 400 DTUs
Databricks cluster config: "...
What time zone is the "timestamp" value in the Databricks usage log? Is it UTC?
timestamp
2020-12-01T00:59:59.000Z
Need to match this to the AWS Cost Explorer time zone for simplicity.
It's UTC. Please see the timestamp field under the Audit Log Schema: https://docs.databrick...