Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Help trying to calculate a percentage

DanVartanian
New Contributor II

The image below shows what my source data is (HAVE) and what I'm trying to get to (WANT).

I want to be able to calculate the percentage of bad messages (where formattedMessage = false) by source and date.

I'm not sure how to achieve this in Databricks SQL. Any help is appreciated.

[Image: HAVE (source data) and WANT (desired output) tables]

1 ACCEPTED SOLUTION

-werners-
Esteemed Contributor III

You could use a window function partitioned by source and date, with a sum of messagecount. This gives you the total per source/date repeated on every row.

Then filter on formattedMessage = false and divide messagecount by that total (multiply by 100 to get a percentage).
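
For anyone who finds this later, here is a rough sketch of that approach. It runs the SQL through Python's built-in sqlite3 module (SQLite 3.25 or newer, for window function support) purely as a local stand-in. The table name `messages`, the sample rows, and the output column names are made up, because the original HAVE data was only posted as an image. In Databricks SQL the query should look much the same, except that formattedMessage would presumably be a boolean compared with `= false` rather than `= 0`.

```python
import sqlite3

# Hypothetical sample rows (source, date, formattedMessage, messagecount).
# These values are invented for illustration; SQLite stores booleans as 0/1.
rows = [
    ("A", "2023-01-01", 1, 90),
    ("A", "2023-01-01", 0, 10),
    ("B", "2023-01-01", 1, 75),
    ("B", "2023-01-01", 0, 25),
    ("A", "2023-01-02", 1, 50),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE messages ("
    "source TEXT, date TEXT, formattedMessage INTEGER, messagecount INTEGER)"
)
conn.executemany("INSERT INTO messages VALUES (?, ?, ?, ?)", rows)

query = """
WITH totals AS (
    SELECT
        source,
        date,
        formattedMessage,
        messagecount,
        -- total per source/date, repeated on every row of that group
        SUM(messagecount) OVER (PARTITION BY source, date) AS total_count
    FROM messages
)
SELECT
    source,
    date,
    ROUND(100.0 * messagecount / total_count, 2) AS bad_pct
FROM totals
WHERE formattedMessage = 0   -- keep only the bad-message rows
ORDER BY source, date
"""

result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

Two things to watch for: the filter has to come after the window sum is computed (hence the CTE), otherwise the total would only count bad messages. Also, a source/date with no formattedMessage = false row drops out of the result instead of showing 0%, so a conditional sum with GROUP BY may be a better fit if you need those rows.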

3 REPLIES 3

Thank you so much

Thank you, I was able to get it following your instructions😀
