01-16-2023 05:29 PM
In my DataFrame I have a column named count. If any value in that column is greater than zero, the job needs to fail. How can I do that?
01-16-2023 07:26 PM
Hi @Mohammed sadamusean
Can you try the code below in PySpark and let me know if you face any issues?
from pyspark.sql.functions import col
# Pull the first value of the column back to the driver.
variable_name = df.select(col("Column_Name")).collect()[0][0]
if variable_name > 0:
    dbutils.notebook.exit('Notebook Failed')
Happy Learning!!
01-17-2023 01:44 AM
Here is a version without collect (collect should not be used in production):
if df.filter("count > 0").count() > 0:
    dbutils.notebook.exit('Notebook Failed')
You can also use a more aggressive version that raises an exception:
if df.filter("count > 0").count() > 0:
    raise Exception("count bigger than 0")
01-17-2023 03:14 AM
But that will get the total count of the column, right? I need to check every specific column value.
01-17-2023 04:02 AM
First you filter for rows matching your condition. You said the column is named count; to keep the example readable, let's assume it is called col instead, so filter("col > 0"). Then you apply the count() function, which returns how many rows match that criterion.
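To make the difference concrete, here is a small sketch; the sample rows are made up and the column name is the count column from the original question:
df = spark.createDataFrame([(0,), (5,), (-2,)], ["count"])
df.count()                      # 3 -> total number of rows in the DataFrame
df.filter("count > 0").count()  # 1 -> only the rows where the condition matches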
01-17-2023 04:17 AM
It is working, but how can we check the column against two conditions, like count > 0 and count < 0? I tried equal to 0 but it didn't work.
01-17-2023 04:21 AM
Just write it like in SQL:
"colname > 0 OR colname < 0"
or
"colname != 0"