I am trying to exclude rows where a column contains a specific value when querying with PySpark, but the filter is not working. I want the equivalent of NOT LIKE in SQL, e.g. not like '%var4%'. The part of the code that is not working is: (col('col4').rlike('var4') == False)
Code:
%python
from pyspark.sql.functions import col
fc_run = spark.table("tbl1")
flowchart_run = fc_run.select(
    'col1',
    'col2',
    'col3',
    'col4',
).filter(
    (col('col1') == 'var1') &
    (col('col2').rlike('var2')) &
    (col('col3').rlike('var3')) &
    (col('col4').rlike('var4') == False)
)
display(flowchart_run)