01-16-2023 05:29 PM
In my dataframe there is a column named count. If that column's value is greater than zero, the job needs to fail. How can I do that?
01-16-2023 07:26 PM
Hi @Mohammed sadamusean
Can you try the code below in PySpark and let me know if you face any issues?
from pyspark.sql.functions import col

# Pull the first value of the column to the driver; fail the notebook if it is greater than zero
variable_name = df.select(col("Column_Name")).collect()[0][0]
if variable_name > 0:
    dbutils.notebook.exit('Notebook Failed')
Happy Learning!!
01-17-2023 01:44 AM
Here is code without collect (collect should not be used in production):
if df.filter("count > 0").count() > 0:
    dbutils.notebook.exit('Notebook Failed')
You can also use a more aggressive version that raises an exception:
if df.filter("count > 0").count() > 0:
    raise Exception("count bigger than 0")
01-17-2023 03:14 AM
But won't that get the total count of the column? I need to check each specific column value.
01-17-2023 04:02 AM
First you filter for the rows matching your condition. You said the column is named count; let's assume it is called col instead, so filter("col > 0"). Then you apply the count() function, which returns how many rows match that criterion. See the sketch below.
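Here is a minimal, self-contained sketch of that filter-then-count pattern; the sample data and the use of a column named count are just illustrative, not from the original thread.

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Toy DataFrame with a column named "count" (placeholder data)
df = spark.createDataFrame([(0,), (3,), (0,)], ["count"])

# filter() keeps only the rows matching the condition;
# count() then returns how many rows survived the filter
bad_rows = df.filter("count > 0").count()
if bad_rows > 0:
    raise Exception(f"{bad_rows} row(s) have count > 0")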
01-17-2023 04:17 AM
It is working, but how can we check the column against two conditions, like count > 0 and count < 0? I tried equal to 0, but it didn't work.
01-17-2023 04:21 AM
Just write the condition like you would in SQL:
"colname > 0 OR colname < 0"
or
"colname != 0"