Is there a way in Azure to compare data in one field?
03-15-2022 02:31 PM
Is there a way to compare time stamps within one field/column for an individual ID? For example, if I have two records for an ID and the time stamps are within 5 minutes of each other, I just want to keep the latest. But if they were, for example, an hour apart, I would keep both records.
03-23-2022 12:51 PM
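A windowing function may be what you need.
from pyspark.sql import functions as F
# Bucket rows into 5-minute windows per ID and keep the latest time stamp in each bucket ("id" is a placeholder column name)
df.groupBy("id", F.window("event_time", "5 minutes")).agg(F.max("event_time").alias("latest_event_time"))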
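To make the bucketing idea concrete, here is a minimal plain-Python sketch of what grouping by fixed 5-minute windows does. It is only an illustration, not Spark code: the helper names `window_start` and `latest_per_window` and the `(id, timestamp)` record shape are assumptions made for the example, and the buckets are aligned to the Unix epoch, which I believe matches the default alignment of `F.window`.

```python
from datetime import datetime, timedelta

WINDOW = timedelta(minutes=5)
EPOCH = datetime(1970, 1, 1)


def window_start(ts, width=WINDOW):
    """Floor a timestamp to the start of its fixed-width bucket (hypothetical helper)."""
    return EPOCH + ((ts - EPOCH) // width) * width


def latest_per_window(records):
    """Keep the latest timestamp per (id, bucket). records: iterable of (id, timestamp)."""
    latest = {}
    for rec_id, ts in records:
        key = (rec_id, window_start(ts))
        if key not in latest or ts > latest[key]:
            latest[key] = ts
    return sorted((rec_id, ts) for (rec_id, _), ts in latest.items())


records = [
    ("A", datetime(2022, 3, 22, 12, 1)),
    ("A", datetime(2022, 3, 22, 12, 3)),
    ("A", datetime(2022, 3, 22, 12, 4)),
    ("A", datetime(2022, 3, 22, 12, 6)),
    ("B", datetime(2022, 3, 22, 13, 0)),
]
result = latest_per_window(records)
```

One thing to watch: the buckets are fixed, so the 12:04 and 12:06 records for ID A both survive even though they are only two minutes apart, because they fall on either side of the 12:05 boundary. If "within 5 minutes of each other" must hold regardless of bucket boundaries, comparing each record with the next one for the same ID is probably a better fit.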
03-23-2022 01:12 PM
So, is this done something like this?
SELECT
r.patientmedicalrecordnumber,
r.callreceiveddatetime as date
FROM table r
LEFT OUTER JOIN table p
ON r.pageid = p.pageid
WHERE p.pagetype = 6
and cast(r.callreceiveddatetime as date) = current_date() - 1
df.groupBy (r.window("event_time","5 minutes"))
ORDER BY r.callreceiveddatetime
03-23-2022 08:00 PM
Since you are trying to do this in SQL, I hope someone else can write you the correct answer. The example above is for PySpark. You can check the SQL syntax in the Databricks documentation.
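In the meantime, here is a rough plain-Python sketch of the rule from the original question, in case it helps when translating to SQL: sort the rows per ID by time stamp and drop a row whenever the next row for the same ID arrives within 5 minutes. The function name `drop_superseded` and the `(id, timestamp)` tuple layout are made up for illustration. In Spark SQL the same comparison could probably be expressed with the `LEAD` window function partitioned by ID and ordered by the time stamp, but I have not tested that against this table.

```python
from datetime import datetime, timedelta
from itertools import groupby
from operator import itemgetter


def drop_superseded(records, gap=timedelta(minutes=5)):
    """Keep a record unless a later record for the same ID follows within `gap`.

    records: iterable of (id, timestamp) tuples (an assumed layout).
    """
    ordered = sorted(records)  # sorts by id, then by timestamp
    kept = []
    for _, group in groupby(ordered, key=itemgetter(0)):
        rows = list(group)
        # Pair each row with the next row for the same ID (None for the last one).
        for current, following in zip(rows, rows[1:] + [None]):
            if following is None or following[1] - current[1] > gap:
                kept.append(current)
    return kept


calls = [
    ("A", datetime(2022, 3, 22, 12, 0)),
    ("A", datetime(2022, 3, 22, 12, 3)),
    ("A", datetime(2022, 3, 22, 13, 3)),
    ("B", datetime(2022, 3, 22, 9, 0)),
]
kept = drop_superseded(calls)
```

With this sample data the 12:00 row for ID A is dropped because 12:03 follows it within 5 minutes, while 12:03 and 13:03 are both kept since they are an hour apart. Note that a chain such as 12:00, 12:04, 12:08 collapses to just 12:08 under this rule, which may or may not be what you want.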