01-20-2022 04:05 AM
The image below shows what my source data is (HAVE) and what I'm trying to get to (WANT).
I want to be able to calculate the percentage of bad messages (where formattedMessage = false) by source and date.
I'm not sure how to achieve this in Databricks SQL. Any help is appreciated.
01-21-2022 12:28 AM
You could use a window function over source and date with a sum of messagecount. This gives you the total per source/date, repeated on every row.
Then filter on formattedMessage = false and divide messagecount by that total.
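As a rough sketch of the two steps above, the query could look something like the one below. It is wrapped in Python's built-in sqlite3 module only so that it can be run anywhere; SQLite stands in for Databricks SQL here, and the same SELECT should carry over with little or no change. The table name `messages` and the sample rows are made up for illustration, and the sketch assumes one row per source, date and formattedMessage value.

```python
import sqlite3

# Hypothetical sample data: (source, date, formattedMessage, messagecount).
# The table name `messages` and these rows are assumptions, not the poster's data.
rows = [
    ("A", "2022-01-01", True, 90),
    ("A", "2022-01-01", False, 10),
    ("B", "2022-01-01", True, 30),
    ("B", "2022-01-01", False, 10),
    ("A", "2022-01-02", True, 50),
]

QUERY = """
WITH totals AS (
    SELECT source,
           date,
           formattedMessage,
           messagecount,
           -- total per source/date, repeated on every row of that group
           SUM(messagecount) OVER (PARTITION BY source, date) AS totalcount
    FROM messages
)
SELECT source,
       date,
       100.0 * messagecount / totalcount AS bad_pct
FROM totals
WHERE formattedMessage = false   -- filter only after the window total exists
ORDER BY source, date
"""

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE messages ("
    "source TEXT, date TEXT, formattedMessage BOOLEAN, messagecount INTEGER)"
)
conn.executemany("INSERT INTO messages VALUES (?, ?, ?, ?)", rows)

result = conn.execute(QUERY).fetchall()
for row in result:
    print(row)
```

The window sum has to be computed in the inner query, before the filter is applied; filtering first would leave only the bad rows in each total. Note that a source/date with no formattedMessage = false row (A on 2022-01-02 in the sample) does not appear in the output at all rather than showing 0%.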
01-20-2022 07:40 AM
Thank you so much
01-21-2022 06:09 AM
Thank you, I was able to get it working by following your instructions.