Topics with Label: Groupby Window Queries

by DanVartanian • New Contributor II

01-20-2022 4:05:33 AM

6953 Views
3 replies
0 kudos

Resolved! Help trying to calculate a percentage

The image below shows what my source data is (HAVE) and what I'm trying to get to (WANT). I want to be able to calculate the percentage of bad messages (where formattedMessage = false) by source and date. I'm not sure how to achieve this in DatabricksS...

Data Engineering

Latest Reply

-werners-
Esteemed Contributor III

01-21-2022 12:28:02 AM

0 kudos

You could use a window function over source and date with a sum of messagecount. This gives you the total per source/date repeated on every row. Then apply a filter on formattedMessage == false and divide messagecount by that total.
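A minimal sketch of that approach, not the poster's actual data or schema. It uses SQLite from the Python standard library as a stand-in for Spark SQL, since the `SUM(...) OVER (PARTITION BY ...)` clause is written the same way in both. The table name `messages`, the column layout, and the sample rows are assumptions made for illustration; only `formattedMessage`, `messagecount`, source and date come from the thread.

```python
import sqlite3

# Hypothetical sample rows: (source, date, formattedMessage, messageCount).
# formattedMessage is stored as 1 (true) or 0 (false).
rows = [
    ("A", "2022-01-01", 1, 90),
    ("A", "2022-01-01", 0, 10),
    ("B", "2022-01-01", 1, 30),
    ("B", "2022-01-01", 0, 10),
]

# The inner query repeats the per-source/date total on every row via a
# window function; the outer query keeps only the "bad" rows.
QUERY = """
SELECT source, date, pct_bad
FROM (
    SELECT source, date, formattedMessage,
           100.0 * messageCount
               / SUM(messageCount) OVER (PARTITION BY source, date) AS pct_bad
    FROM messages
)
WHERE formattedMessage = 0
ORDER BY source, date
"""

def percent_bad(rows):
    """Return (source, date, percentage of bad messages) tuples."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE messages ("
        "source TEXT, date TEXT, formattedMessage INTEGER, messageCount INTEGER)"
    )
    conn.executemany("INSERT INTO messages VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result

result = percent_bad(rows)
for source, date, pct in result:
    print(source, date, pct)
```

The filter has to sit outside the windowed query: filtering first would drop the good rows before the total is computed. This sketch also assumes one row per source, date and formattedMessage value; if the data has several, aggregate them first.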

by itay • New Contributor II

11-28-2021 9:04:02 AM

2192 Views
2 replies
1 kudos

Streaming with runOnce and groupBy window queries

I have a streaming job running a groupBy query with a Window of 3 days. The query is searching for different types of events. The stream is configured with runOnce and there is a job scheduled for every hour. Now, I'm not sure what data is processed ea...

Data Engineering

Latest Reply

jose_gonzalez
Databricks Employee

11-29-2021 10:54:23 AM

1 kudos

Hi @itay k, you will need to take a look at the Progress Reporter. This will show the micro-batch JSON metrics. For example, the metric called "numInputRows" displays the number of input rows processed in the micro-batch. You will...
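As a rough illustration of reading that metric, the sketch below parses a progress payload and pulls out `numInputRows` overall and per source. The JSON is a hypothetical, heavily trimmed sample, not output from the poster's job; the field names (`batchId`, `numInputRows`, `sources`, `description`) follow the Structured Streaming progress format, and the source descriptions are made up. In PySpark the same structure should be available as a dict from `StreamingQuery.lastProgress`.

```python
import json

# Hypothetical progress payload, trimmed to a few fields for illustration.
sample_progress = """
{
  "batchId": 12,
  "numInputRows": 1500,
  "sources": [
    {"description": "hypothetical-source-a", "numInputRows": 1000},
    {"description": "hypothetical-source-b", "numInputRows": 500}
  ]
}
"""

def summarize_progress(progress_json):
    """Extract the batch id and input row counts from a progress JSON string."""
    progress = json.loads(progress_json)
    # Map each source description to the rows it contributed to this batch.
    per_source = {
        source["description"]: source["numInputRows"]
        for source in progress.get("sources", [])
    }
    return {
        "batchId": progress["batchId"],
        "numInputRows": progress["numInputRows"],
        "perSource": per_source,
    }

summary = summarize_progress(sample_progress)
print(summary)
```

Logging this summary after each hourly run would show how many rows each runOnce batch actually picked up.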
