<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Traversing to previous rows and getting the data based on condition in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/traversing-to-previous-rows-and-getting-the-data-based-on/m-p/64872#M32687</link>
    <description>&lt;P&gt;This event data does not follow a specific pattern, so I cannot group it by interval. The only options I see are a self join or looping, but I want to avoid both. Is there any other option for this data set?&lt;/P&gt;</description>
    <pubDate>Thu, 28 Mar 2024 05:18:02 GMT</pubDate>
    <dc:creator>RajNath</dc:creator>
    <dc:date>2024-03-28T05:18:02Z</dc:date>
    <item>
      <title>Traversing to previous rows and getting the data based on condition</title>
      <link>https://community.databricks.com/t5/data-engineering/traversing-to-previous-rows-and-getting-the-data-based-on/m-p/64506#M32589</link>
      <description>&lt;P&gt;Sample Input data set&lt;/P&gt;&lt;TABLE width="556px"&gt;&lt;TBODY&gt;&lt;TR&gt;&lt;TD width="178.406px" height="30px"&gt;ClusterId&lt;/TD&gt;&lt;TD width="120.031px" height="30px"&gt;Event&lt;/TD&gt;&lt;TD width="256.562px" height="30px"&gt;EventTime&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD width="178.406px" height="57px"&gt;1212-18-r9u1kzn1&lt;/TD&gt;&lt;TD width="120.031px" height="57px"&gt;RUNNING&lt;/TD&gt;&lt;TD width="256.562px" height="57px"&gt;2024-02-02T11:38:30.168+00:00&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD width="178.406px" height="57px"&gt;1212-18-r9u1kzn1&lt;/TD&gt;&lt;TD width="120.031px" height="57px"&gt;TERMINATING&lt;/TD&gt;&lt;TD width="256.562px" height="57px"&gt;2024-02-02T13:43:33.933+00:00&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD width="178.406px" height="57px"&gt;1212-18-r9u1kzn1&lt;/TD&gt;&lt;TD width="120.031px" height="57px"&gt;STARTING&lt;/TD&gt;&lt;TD width="256.562px" height="57px"&gt;2024-02-02T15:50:05.174+00:00&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD width="178.406px" height="57px"&gt;1212-18-r9u1kzn1&lt;/TD&gt;&lt;TD width="120.031px" height="57px"&gt;RUNNING&lt;/TD&gt;&lt;TD width="256.562px" height="57px"&gt;2024-02-02T15:54:21.510+00:00&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD width="178.406px" height="57px"&gt;1212-18-r9u1kzn1&lt;/TD&gt;&lt;TD width="120.031px" height="57px"&gt;RUNNING&lt;/TD&gt;&lt;TD width="256.562px" height="57px"&gt;2024-02-02T16:09:20.576+00:00&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD width="178.406px" height="57px"&gt;1212-18-r9u1kzn1&lt;/TD&gt;&lt;TD width="120.031px" height="57px"&gt;RUNNING&lt;/TD&gt;&lt;TD width="256.562px" height="57px"&gt;2024-02-02T16:19:58.744+00:00&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD width="178.406px" height="57px"&gt;1212-18-r9u1kzn1&lt;/TD&gt;&lt;TD width="120.031px" height="57px"&gt;TERMINATING&lt;/TD&gt;&lt;TD width="256.562px" height="57px"&gt;2024-02-02T17:18:33.863+00:00&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD width="178.406px" 
height="57px"&gt;1212-18-r9u1kzn1&lt;/TD&gt;&lt;TD width="120.031px" height="57px"&gt;STARTING&lt;/TD&gt;&lt;TD width="256.562px" height="57px"&gt;2024-02-02T17:22:38.635+00:00&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD width="178.406px" height="57px"&gt;1212-18-r9u1kzn1&lt;/TD&gt;&lt;TD width="120.031px" height="57px"&gt;RUNNING&lt;/TD&gt;&lt;TD width="256.562px" height="57px"&gt;2024-02-02T17:23:40.781+00:00&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD width="178.406px" height="57px"&gt;1212-18-r9u1kzn1&lt;/TD&gt;&lt;TD width="120.031px" height="57px"&gt;TERMINATING&lt;/TD&gt;&lt;TD width="256.562px" height="57px"&gt;2024-02-02T18:03:33.953+00:00&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD width="178.406px" height="57px"&gt;1212-18-r9u1kzn1&lt;/TD&gt;&lt;TD width="120.031px" height="57px"&gt;STARTING&lt;/TD&gt;&lt;TD width="256.562px" height="57px"&gt;2024-02-02T21:10:21.651+00:00&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD width="178.406px" height="57px"&gt;1212-18-r9u1kzn1&lt;/TD&gt;&lt;TD width="120.031px" height="57px"&gt;RUNNING&lt;/TD&gt;&lt;TD width="256.562px" height="57px"&gt;2024-02-02T21:13:59.842+00:00&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD width="178.406px" height="57px"&gt;1212-18-r9u1kzn1&lt;/TD&gt;&lt;TD width="120.031px" height="57px"&gt;TERMINATING&lt;/TD&gt;&lt;TD width="256.562px" height="57px"&gt;2024-02-02T22:43:34.022+00:00&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;P&gt;Below is the sample expected output. For each "TERMINATING" event, RunningEventTime should show the event time of the previous "RUNNING" event. 
If a "STARTING" event is present, its event time should be shown in the "StartingEventTime" column instead.&lt;/P&gt;&lt;TABLE width="768"&gt;&lt;TBODY&gt;&lt;TR&gt;&lt;TD width="77.5156px"&gt;ClusterId&lt;/TD&gt;&lt;TD width="120.031px"&gt;Event&lt;/TD&gt;&lt;TD width="207.875px"&gt;EventTime&lt;/TD&gt;&lt;TD width="180.188px"&gt;RunningEventTime&lt;/TD&gt;&lt;TD width="181.391px"&gt;StartingEventTime&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD width="77.5156px"&gt;1212-18-r9u1kzn1&lt;/TD&gt;&lt;TD width="120.031px"&gt;RUNNING&lt;/TD&gt;&lt;TD width="207.875px"&gt;2024-02-02T11:38:30.168+00:00&lt;/TD&gt;&lt;TD width="180.188px"&gt;&amp;nbsp;&lt;/TD&gt;&lt;TD width="181.391px"&gt;&amp;nbsp;&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD width="77.5156px"&gt;1212-18-r9u1kzn1&lt;/TD&gt;&lt;TD width="120.031px"&gt;TERMINATING&lt;/TD&gt;&lt;TD width="207.875px"&gt;2024-02-02T13:43:33.933+00:00&lt;/TD&gt;&lt;TD width="180.188px"&gt;2024-02-02T11:38:30.168+00:00&lt;/TD&gt;&lt;TD width="181.391px"&gt;&amp;nbsp;&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD width="77.5156px"&gt;1212-18-r9u1kzn1&lt;/TD&gt;&lt;TD width="120.031px"&gt;STARTING&lt;/TD&gt;&lt;TD width="207.875px"&gt;2024-02-02T15:50:05.174+00:00&lt;/TD&gt;&lt;TD width="180.188px"&gt;&amp;nbsp;&lt;/TD&gt;&lt;TD width="181.391px"&gt;&amp;nbsp;&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD width="77.5156px"&gt;1212-18-r9u1kzn1&lt;/TD&gt;&lt;TD width="120.031px"&gt;RUNNING&lt;/TD&gt;&lt;TD width="207.875px"&gt;2024-02-02T15:54:21.510+00:00&lt;/TD&gt;&lt;TD width="180.188px"&gt;&amp;nbsp;&lt;/TD&gt;&lt;TD width="181.391px"&gt;&amp;nbsp;&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD width="77.5156px"&gt;1212-18-r9u1kzn1&lt;/TD&gt;&lt;TD width="120.031px"&gt;RUNNING&lt;/TD&gt;&lt;TD width="207.875px"&gt;2024-02-02T16:09:20.576+00:00&lt;/TD&gt;&lt;TD width="180.188px"&gt;&amp;nbsp;&lt;/TD&gt;&lt;TD width="181.391px"&gt;&amp;nbsp;&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD width="77.5156px"&gt;1212-18-r9u1kzn1&lt;/TD&gt;&lt;TD 
width="120.031px"&gt;RUNNING&lt;/TD&gt;&lt;TD width="207.875px"&gt;2024-02-02T16:19:58.744+00:00&lt;/TD&gt;&lt;TD width="180.188px"&gt;&amp;nbsp;&lt;/TD&gt;&lt;TD width="181.391px"&gt;&amp;nbsp;&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD width="77.5156px"&gt;1212-18-r9u1kzn1&lt;/TD&gt;&lt;TD width="120.031px"&gt;TERMINATING&lt;/TD&gt;&lt;TD width="207.875px"&gt;2024-02-02T17:18:33.863+00:00&lt;/TD&gt;&lt;TD width="180.188px"&gt;&amp;nbsp;&lt;/TD&gt;&lt;TD width="181.391px"&gt;2024-02-02T15:50:05.174+00:00&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD width="77.5156px"&gt;1212-18-r9u1kzn1&lt;/TD&gt;&lt;TD width="120.031px"&gt;STARTING&lt;/TD&gt;&lt;TD width="207.875px"&gt;2024-02-02T17:22:38.635+00:00&lt;/TD&gt;&lt;TD width="180.188px"&gt;&amp;nbsp;&lt;/TD&gt;&lt;TD width="181.391px"&gt;&amp;nbsp;&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD width="77.5156px"&gt;1212-18-r9u1kzn1&lt;/TD&gt;&lt;TD width="120.031px"&gt;RUNNING&lt;/TD&gt;&lt;TD width="207.875px"&gt;2024-02-02T17:23:40.781+00:00&lt;/TD&gt;&lt;TD width="180.188px"&gt;&amp;nbsp;&lt;/TD&gt;&lt;TD width="181.391px"&gt;&amp;nbsp;&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD width="77.5156px"&gt;1212-18-r9u1kzn1&lt;/TD&gt;&lt;TD width="120.031px"&gt;TERMINATING&lt;/TD&gt;&lt;TD width="207.875px"&gt;2024-02-02T18:03:33.953+00:00&lt;/TD&gt;&lt;TD width="180.188px"&gt;&amp;nbsp;&lt;/TD&gt;&lt;TD width="181.391px"&gt;2024-02-02T17:22:38.635+00:00&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD width="77.5156px"&gt;1212-18-r9u1kzn1&lt;/TD&gt;&lt;TD width="120.031px"&gt;STARTING&lt;/TD&gt;&lt;TD width="207.875px"&gt;2024-02-02T21:10:21.651+00:00&lt;/TD&gt;&lt;TD width="180.188px"&gt;&amp;nbsp;&lt;/TD&gt;&lt;TD width="181.391px"&gt;&amp;nbsp;&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD width="77.5156px"&gt;1212-18-r9u1kzn1&lt;/TD&gt;&lt;TD width="120.031px"&gt;RUNNING&lt;/TD&gt;&lt;TD width="207.875px"&gt;2024-02-02T21:13:59.842+00:00&lt;/TD&gt;&lt;TD width="180.188px"&gt;&amp;nbsp;&lt;/TD&gt;&lt;TD 
width="181.391px"&gt;&amp;nbsp;&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD width="77.5156px"&gt;1212-18-r9u1kzn1&lt;/TD&gt;&lt;TD width="120.031px"&gt;TERMINATING&lt;/TD&gt;&lt;TD width="207.875px"&gt;2024-02-02T22:43:34.022+00:00&lt;/TD&gt;&lt;TD width="180.188px"&gt;&amp;nbsp;&lt;/TD&gt;&lt;TD width="181.391px"&gt;2024-02-02T21:10:21.651+00:00&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;P&gt;I tried a few options, such as a self join, but that is not ideal when the data set is large. I also tried looping, which has the same problem on large data sets. I tried the window function "lag" but could not make it work. Any suggestion or hint would be really helpful.&lt;/P&gt;</description>
      <pubDate>Mon, 25 Mar 2024 11:10:40 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/traversing-to-previous-rows-and-getting-the-data-based-on/m-p/64506#M32589</guid>
      <dc:creator>RajNath</dc:creator>
      <dc:date>2024-03-25T11:10:40Z</dc:date>
    </item>
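    <!--
    One possible way to get the expected output in the question without a self join or an explicit loop is a pair of window functions. This is only a sketch under stated assumptions, not a tested Databricks solution: it runs the SQL through Python's built-in sqlite3 module so it is self-contained, and the table name `events`, the column names `cluster_id`, `event`, `event_time`, `session_id`, and the helper `annotate` are all hypothetical. It also assumes that every TERMINATING event closes one "session" per cluster, and that "previous running event" means the latest RUNNING event in that session. Timestamps are compared as ISO 8601 strings, which only works because they share one format and offset.

    ```python
    import sqlite3

    # Sample rows copied from the question: (ClusterId, Event, EventTime).
    CLUSTER = "1212-18-r9u1kzn1"
    EVENTS = [
        (CLUSTER, "RUNNING", "2024-02-02T11:38:30.168+00:00"),
        (CLUSTER, "TERMINATING", "2024-02-02T13:43:33.933+00:00"),
        (CLUSTER, "STARTING", "2024-02-02T15:50:05.174+00:00"),
        (CLUSTER, "RUNNING", "2024-02-02T15:54:21.510+00:00"),
        (CLUSTER, "RUNNING", "2024-02-02T16:09:20.576+00:00"),
        (CLUSTER, "RUNNING", "2024-02-02T16:19:58.744+00:00"),
        (CLUSTER, "TERMINATING", "2024-02-02T17:18:33.863+00:00"),
        (CLUSTER, "STARTING", "2024-02-02T17:22:38.635+00:00"),
        (CLUSTER, "RUNNING", "2024-02-02T17:23:40.781+00:00"),
        (CLUSTER, "TERMINATING", "2024-02-02T18:03:33.953+00:00"),
        (CLUSTER, "STARTING", "2024-02-02T21:10:21.651+00:00"),
        (CLUSTER, "RUNNING", "2024-02-02T21:13:59.842+00:00"),
        (CLUSTER, "TERMINATING", "2024-02-02T22:43:34.022+00:00"),
    ]

    # Step 1 (sessions): number each row by how many TERMINATING events
    #   came strictly before it, so a session ends with its TERMINATING row.
    # Step 2 (marked): per session, pick the latest STARTING and RUNNING times.
    # Step 3: only TERMINATING rows get a value; STARTING wins over RUNNING.
    QUERY = """
    WITH sessions AS (
        SELECT cluster_id, event, event_time,
               COALESCE(SUM(CASE WHEN event = 'TERMINATING' THEN 1 ELSE 0 END) OVER (
                   PARTITION BY cluster_id ORDER BY event_time
                   ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING), 0) AS session_id
        FROM events
    ),
    marked AS (
        SELECT cluster_id, event, event_time,
               MAX(CASE WHEN event = 'STARTING' THEN event_time END)
                   OVER (PARTITION BY cluster_id, session_id) AS last_starting,
               MAX(CASE WHEN event = 'RUNNING' THEN event_time END)
                   OVER (PARTITION BY cluster_id, session_id) AS last_running
        FROM sessions
    )
    SELECT cluster_id, event, event_time,
           CASE WHEN event = 'TERMINATING' AND last_starting IS NULL
                THEN last_running END AS running_event_time,
           CASE WHEN event = 'TERMINATING' THEN last_starting END AS starting_event_time
    FROM marked
    ORDER BY cluster_id, event_time
    """


    def annotate(rows):
        """Load rows into an in-memory table and return the annotated rows."""
        conn = sqlite3.connect(":memory:")
        conn.execute("CREATE TABLE events (cluster_id TEXT, event TEXT, event_time TEXT)")
        conn.executemany("INSERT INTO events VALUES (?, ?, ?)", rows)
        result = conn.execute(QUERY).fetchall()
        conn.close()
        return result


    if __name__ == "__main__":
        for row in annotate(EVENTS):
            print(row)
    ```

    Spark SQL has window functions with similar syntax, so the same idea (a running count of TERMINATING rows as a session key, then a conditional MAX per session) may carry over, but that has not been verified here and real timestamp columns would need proper typing.
    -->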
    <item>
      <title>Re: Traversing to previous rows and getting the data based on condition</title>
      <link>https://community.databricks.com/t5/data-engineering/traversing-to-previous-rows-and-getting-the-data-based-on/m-p/64872#M32687</link>
      <description>&lt;P&gt;This event data does not follow a specific pattern, so I cannot group it by interval. The only options I see are a self join or looping, but I want to avoid both. Is there any other option for this data set?&lt;/P&gt;</description>
      <pubDate>Thu, 28 Mar 2024 05:18:02 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/traversing-to-previous-rows-and-getting-the-data-based-on/m-p/64872#M32687</guid>
      <dc:creator>RajNath</dc:creator>
      <dc:date>2024-03-28T05:18:02Z</dc:date>
    </item>
  </channel>
</rss>

