<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: How can I add a duration in milliseconds to a timestamp? in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/how-can-i-add-a-duration-in-milliseconds-to-a-timestamp/m-p/10068#M5313</link>
    <description>&lt;P&gt;Hi @Ivo Merchiers,&lt;/P&gt;&lt;P&gt;Here is how I did it. As you described, I take a timestamp with millisecond precision in the "ts" column and the offset to add in the "offSetMillis" column. I first convert the "ts" column to milliseconds, then add "offSetMillis" to it, and finally convert the sum back to a timestamp in the "new_ts" column.&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="Screenshot 2023-02-06 at 6.50.51 PM"&gt;&lt;img src="https://community.databricks.com/t5/image/serverpage/image-id/723iC66D04EA26AB8424/image-size/large?v=v2&amp;amp;px=999" role="button" title="Screenshot 2023-02-06 at 6.50.51 PM" alt="Screenshot 2023-02-06 at 6.50.51 PM" /&gt;&lt;/span&gt;&lt;/P&gt;</description>
    <pubDate>Mon, 06 Feb 2023 13:26:04 GMT</pubDate>
    <dc:creator>Lakshay</dc:creator>
    <dc:date>2023-02-06T13:26:04Z</dc:date>
    <item>
      <title>How can I add a duration in milliseconds to a timestamp?</title>
      <link>https://community.databricks.com/t5/data-engineering/how-can-i-add-a-duration-in-milliseconds-to-a-timestamp/m-p/10065#M5310</link>
      <description>&lt;P&gt;Let's say I have a DataFrame with a timestamp column and an offset column in milliseconds, of type timestamp and long respectively.&lt;/P&gt;&lt;P&gt;E.g.&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;from datetime import datetime
df = spark.createDataFrame(
    [
        (datetime(2021, 1, 1), 1500, ),
        (datetime(2021, 1, 2), 1200, )
    ],
    ["timestamp", "offsetmillis", ],
)&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;Now I want to add these offsets to the timestamps, so that I get:&lt;/P&gt;&lt;P&gt;2021-01-01T00:00:01.500 and 2021-01-02T00:00:01.200&lt;/P&gt;&lt;P&gt;If I add the two columns directly, I get a type mismatch error, which does make sense:&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;[DATATYPE_MISMATCH.BINARY_OP_DIFF_TYPES] Cannot resolve "(timestamp + offsetmillis)" due to data type mismatch: the left and right operands of the binary operator have incompatible types ("TIMESTAMP" and "BIGINT")&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;However, I'm not sure how best to cast the offset to a duration or interval.&lt;/P&gt;</description>
      <pubDate>Fri, 03 Feb 2023 15:34:30 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/how-can-i-add-a-duration-in-milliseconds-to-a-timestamp/m-p/10065#M5310</guid>
      <dc:creator>Merchiv</dc:creator>
      <dc:date>2023-02-03T15:34:30Z</dc:date>
    </item>
    <item>
      <title>Re: How can I add a duration in milliseconds to a timestamp?</title>
      <link>https://community.databricks.com/t5/data-engineering/how-can-i-add-a-duration-in-milliseconds-to-a-timestamp/m-p/10066#M5311</link>
      <description>&lt;P&gt;Hi @Ivo Merchiers, if you are just trying to create a timestamp with milliseconds, you can do so directly by passing the value to datetime, as below.&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="Screenshot 2023-02-04 at 12.28.02 AM"&gt;&lt;img src="https://community.databricks.com/t5/image/serverpage/image-id/720iDE03A75318FCC196/image-size/large?v=v2&amp;amp;px=999" role="button" title="Screenshot 2023-02-04 at 12.28.02 AM" alt="Screenshot 2023-02-04 at 12.28.02 AM" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;However, if your use case is to add milliseconds to an existing timestamp value, then you have to convert the timestamp to milliseconds before adding the offset to it.&lt;/P&gt;</description>
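      <content:encoded><![CDATA[
A minimal sketch of what "providing the value in datetime" could look like, since the original example is only available as a screenshot. It assumes the plain Python `datetime` constructor is meant; the variable name `ts` is hypothetical.

```python
from datetime import datetime

# datetime's last positional field is microseconds, so 500 ms is written as 500_000.
ts = datetime(2021, 1, 1, 0, 0, 1, 500_000)

# Render with millisecond precision to check the value.
print(ts.isoformat(timespec="milliseconds"))
```

This only helps when the timestamps are built in Python; it does not cover values that already live in a DataFrame column.
]]></content:encoded>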
      <pubDate>Fri, 03 Feb 2023 19:02:05 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/how-can-i-add-a-duration-in-milliseconds-to-a-timestamp/m-p/10066#M5311</guid>
      <dc:creator>Lakshay</dc:creator>
      <dc:date>2023-02-03T19:02:05Z</dc:date>
    </item>
    <item>
      <title>Re: How can I add a duration in milliseconds to a timestamp?</title>
      <link>https://community.databricks.com/t5/data-engineering/how-can-i-add-a-duration-in-milliseconds-to-a-timestamp/m-p/10067#M5312</link>
      <description>&lt;P&gt;Hi @Lakshay Goel,&lt;/P&gt;&lt;P&gt;I only added the `spark.createDataFrame` command here as an example. The real data comes from existing tables, so I can't do it in the Python initialisation.&lt;/P&gt;&lt;P&gt;I want to add some milliseconds (in integer/long/whatever format) to a timestamp (which should already have millisecond precision) &lt;B&gt;in&lt;/B&gt; PySpark.&lt;/P&gt;&lt;P&gt;How would I go about the second approach you proposed?&lt;/P&gt;</description>
      <pubDate>Mon, 06 Feb 2023 07:56:03 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/how-can-i-add-a-duration-in-milliseconds-to-a-timestamp/m-p/10067#M5312</guid>
      <dc:creator>Merchiv</dc:creator>
      <dc:date>2023-02-06T07:56:03Z</dc:date>
    </item>
    <item>
      <title>Re: How can I add a duration in milliseconds to a timestamp?</title>
      <link>https://community.databricks.com/t5/data-engineering/how-can-i-add-a-duration-in-milliseconds-to-a-timestamp/m-p/10068#M5313</link>
      <description>&lt;P&gt;Hi @Ivo Merchiers,&lt;/P&gt;&lt;P&gt;Here is how I did it. As you described, I take a timestamp with millisecond precision in the "ts" column and the offset to add in the "offSetMillis" column. I first convert the "ts" column to milliseconds, then add "offSetMillis" to it, and finally convert the sum back to a timestamp in the "new_ts" column.&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="Screenshot 2023-02-06 at 6.50.51 PM"&gt;&lt;img src="https://community.databricks.com/t5/image/serverpage/image-id/723iC66D04EA26AB8424/image-size/large?v=v2&amp;amp;px=999" role="button" title="Screenshot 2023-02-06 at 6.50.51 PM" alt="Screenshot 2023-02-06 at 6.50.51 PM" /&gt;&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Mon, 06 Feb 2023 13:26:04 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/how-can-i-add-a-duration-in-milliseconds-to-a-timestamp/m-p/10068#M5313</guid>
      <dc:creator>Lakshay</dc:creator>
      <dc:date>2023-02-06T13:26:04Z</dc:date>
    </item>
    <item>
      <title>Re: How can I add a duration in milliseconds to a timestamp?</title>
      <link>https://community.databricks.com/t5/data-engineering/how-can-i-add-a-duration-in-milliseconds-to-a-timestamp/m-p/10069#M5314</link>
      <description>&lt;P&gt;Although @Lakshay Goel's solution works, we've been using an alternative approach that we found a bit more readable:&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;from pyspark.sql import Column, functions as f


def make_dt_interval_sec(col: Column):
    # make_dt_interval(days, hours, mins, secs) builds a day-time interval.
    # The column is passed as its SQL string, which relies on the private _jc attribute.
    return f.expr(f"make_dt_interval(0, 0, 0, {col._jc.toString()})")


# Convert the millisecond offset to seconds and add it to the timestamp
# (use - instead of + to subtract the offset).
df.withColumn(
    "new_ts",
    f.col("timestamp") + make_dt_interval_sec(f.col("offsetmillis") / 1000),
)&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;I'm not sure if there is any performance difference between the two methods.&lt;/P&gt;</description>
      <pubDate>Thu, 02 Mar 2023 07:41:35 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/how-can-i-add-a-duration-in-milliseconds-to-a-timestamp/m-p/10069#M5314</guid>
      <dc:creator>Merchiv</dc:creator>
      <dc:date>2023-03-02T07:41:35Z</dc:date>
    </item>
  </channel>
</rss>

