03-23-2022 02:41 PM
@Hubert Dudek, let me backtrack for a second. When I create the inner join, I also apply some conditions:
    from pyspark.sql.functions import col, datediff

    data_to_save = (
        table1
        .join(
            table2,
            (table1.user == table2.user)
            & (col("eventTime") >= col("tableTime"))
            & (datediff(table1.eventDate, table2.tableEventDate) <= window),
            "inner",
        )
    )

This is the table I want to write to a Delta table.
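To make the three join conditions concrete, here is a rough plain-Python sketch of the same check applied to a single pair of rows. It is not PySpark: the dict keys mirror the column names above, while the `rows_match` helper, the sample values, and `window=7` are all made up for illustration.

```python
from datetime import date, datetime


def rows_match(row1, row2, window):
    """Plain-Python analogue of the three join conditions for one pair of rows."""
    same_user = row1["user"] == row2["user"]
    event_not_before = row1["eventTime"] >= row2["tableTime"]
    # Spark's datediff(end, start) counts the days from start to end
    days_apart = (row1["eventDate"] - row2["tableEventDate"]).days
    return same_user and event_not_before and days_apart <= window


# Hypothetical sample rows (not real data)
event = {"user": "u1", "eventTime": datetime(2022, 3, 20, 12, 0), "eventDate": date(2022, 3, 20)}
recent = {"user": "u1", "tableTime": datetime(2022, 3, 18, 9, 0), "tableEventDate": date(2022, 3, 18)}
old = {"user": "u1", "tableTime": datetime(2022, 3, 1, 9, 0), "tableEventDate": date(2022, 3, 1)}

print(rows_match(event, recent, window=7))  # True: the dates are 2 days apart
print(rows_match(event, old, window=7))     # False: the dates are 19 days apart
```

Because it is an inner join, only pairs that pass all three checks end up in the result.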
Does the code snippet you posted go in the write statement I had, or earlier, in the join statement I pasted above?
As for your question:
Each row may not have one unique id. For example, consider 10 rows: three of them may have the same value in the "user Id" column but different values in the "eventId" column. It is important to me to keep these rows separate even though they share the same user id. I hope this answers your question.
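To illustrate what I mean, here is a small plain-Python sketch with made-up values (not my real data). It assumes the pair of "user Id" and "eventId" is what tells rows apart, which is my reading of the data rather than something I have verified.

```python
from collections import Counter

# Hypothetical rows: three share a "user Id" but differ in "eventId"
rows = [
    {"user Id": "u1", "eventId": "e1"},
    {"user Id": "u1", "eventId": "e2"},
    {"user Id": "u1", "eventId": "e3"},
    {"user Id": "u2", "eventId": "e4"},
]

# Count how often each candidate key value occurs
user_counts = Counter(r["user Id"] for r in rows)
pair_counts = Counter((r["user Id"], r["eventId"]) for r in rows)

# A key is unique only if no value occurs more than once
user_id_is_unique = all(n == 1 for n in user_counts.values())
pair_is_unique = all(n == 1 for n in pair_counts.values())

print(user_id_is_unique)  # False: "u1" appears three times
print(pair_is_unique)     # True: every (user Id, eventId) pair appears once
```

So deduplicating or matching on "user Id" alone would collapse rows I need to keep, while the pair of columns keeps them apart.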