<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic exclude (not like) filter using pyspark in Get Started Discussions</title>
    <link>https://community.databricks.com/t5/get-started-discussions/exclude-not-like-filter-using-pyspark/m-p/89234#M8291</link>
    <description>&lt;P&gt;I am trying to exclude rows that contain a specific value when querying with PySpark, but the filter is not working. I want the equivalent of "NOT LIKE" in SQL, e.g. not like '%var4%'. The part of the code that is not working is: (col('col4').rlike('var4') == False)&lt;BR /&gt;Code:&lt;/P&gt;&lt;P&gt;%python&lt;BR /&gt;from pyspark.sql.functions import col&lt;/P&gt;&lt;P&gt;fc_run = spark.table("tbl1")&lt;/P&gt;&lt;P&gt;flowchart_run = fc_run.select(&lt;BR /&gt;'col1',&lt;BR /&gt;'col2',&lt;BR /&gt;'col3',&lt;BR /&gt;'col4',&lt;/P&gt;&lt;P&gt;).filter(&lt;BR /&gt;&amp;nbsp; (col('col1') == 'var1') &amp;amp;&lt;BR /&gt;(col('col2').rlike('var2')) &amp;amp;&lt;BR /&gt;(col('col3').rlike('var3')) &amp;amp;&lt;BR /&gt;(col('col4').rlike('var4') == False)&lt;/P&gt;&lt;P&gt;)&lt;/P&gt;&lt;P&gt;display(flowchart_run)&lt;/P&gt;</description>
    <pubDate>Tue, 10 Sep 2024 02:36:22 GMT</pubDate>
    <dc:creator>abueno</dc:creator>
    <dc:date>2024-09-10T02:36:22Z</dc:date>
    <item>
      <title>exclude (not like) filter using pyspark</title>
      <link>https://community.databricks.com/t5/get-started-discussions/exclude-not-like-filter-using-pyspark/m-p/89234#M8291</link>
      <description>&lt;P&gt;I am trying to exclude rows that contain a specific value when querying with PySpark, but the filter is not working. I want the equivalent of "NOT LIKE" in SQL, e.g. not like '%var4%'. The part of the code that is not working is: (col('col4').rlike('var4') == False)&lt;BR /&gt;Code:&lt;/P&gt;&lt;P&gt;%python&lt;BR /&gt;from pyspark.sql.functions import col&lt;/P&gt;&lt;P&gt;fc_run = spark.table("tbl1")&lt;/P&gt;&lt;P&gt;flowchart_run = fc_run.select(&lt;BR /&gt;'col1',&lt;BR /&gt;'col2',&lt;BR /&gt;'col3',&lt;BR /&gt;'col4',&lt;/P&gt;&lt;P&gt;).filter(&lt;BR /&gt;&amp;nbsp; (col('col1') == 'var1') &amp;amp;&lt;BR /&gt;(col('col2').rlike('var2')) &amp;amp;&lt;BR /&gt;(col('col3').rlike('var3')) &amp;amp;&lt;BR /&gt;(col('col4').rlike('var4') == False)&lt;/P&gt;&lt;P&gt;)&lt;/P&gt;&lt;P&gt;display(flowchart_run)&lt;/P&gt;</description>
      <pubDate>Tue, 10 Sep 2024 02:36:22 GMT</pubDate>
      <guid>https://community.databricks.com/t5/get-started-discussions/exclude-not-like-filter-using-pyspark/m-p/89234#M8291</guid>
      <dc:creator>abueno</dc:creator>
      <dc:date>2024-09-10T02:36:22Z</dc:date>
    </item>
    <item>
      <title>Re: exclude (not like) filter using pyspark</title>
      <link>https://community.databricks.com/t5/get-started-discussions/exclude-not-like-filter-using-pyspark/m-p/89237#M8292</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/114593"&gt;@abueno&lt;/a&gt;&amp;nbsp;,&lt;/P&gt;
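&lt;P&gt;To replicate a SQL `not like '%var4%'` clause in the DataFrame API, negate `rlike` with the `~` operator. Because `rlike` performs an unanchored regex match, the pattern `var4` matches anywhere in the string, just like `%var4%`:&lt;/P&gt;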
&lt;LI-CODE lang="python"&gt;df.filter(~col('col4').rlike('var4')).display()&lt;/LI-CODE&gt;
&lt;P&gt;Here's a basic reproducible example:&lt;/P&gt;
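&lt;LI-CODE lang="python"&gt;from pyspark.sql import functions as f
from pyspark.sql.functions import col

# Build 20 rows: 10 with col4 = "var3" and 10 with col4 = "var4"
df = (spark.range(10).withColumn("col4", f.lit("var3"))).union(
      spark.range(10).withColumn("col4", f.lit("var4")))

# Keep only the rows whose col4 does not match the pattern "var4"
df.filter(~col('col4').rlike('var4')).groupBy('col4').count().display()
# Output:
# col4	count
# var3	10&lt;/LI-CODE&gt;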
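&lt;P&gt;If it helps to see the row-level semantics without a cluster, here is a minimal plain-Python sketch of what `~col('col4').rlike('var4')` does for each value. The helper names `rlike_row` and `sql_not` are hypothetical, and Python's `re` module only stands in for the Java regex engine that Spark uses, so treat this as an illustration rather than Spark's actual implementation:&lt;/P&gt;

```python
import re


def rlike_row(value, pattern):
    """Hypothetical one-row model of Column.rlike: unanchored regex search.

    A NULL (None) input yields NULL (None), mirroring SQL three-valued logic.
    """
    if value is None:
        return None
    return re.search(pattern, value) is not None


def sql_not(flag):
    """Hypothetical model of the ~ operator on a boolean column: NOT NULL stays NULL."""
    return None if flag is None else not flag


rows = ["var3", "xvar4y", "var4", None, "var"]

# A filter keeps a row only when the predicate is exactly True,
# so the NULL row is dropped together with the rows that match "var4".
kept = [r for r in rows if sql_not(rlike_row(r, "var4")) is True]
print(kept)  # ['var3', 'var']

# Regex metacharacters are not literal: "." matches any character,
# so a literal pattern has to be escaped first.
print(rlike_row("axb", "a.b"))             # True
print(rlike_row("axb", re.escape("a.b")))  # False
```

&lt;P&gt;Two practical takeaways: for a literal substring, `~col('col4').contains('var4')` or `~col('col4').like('%var4%')` avoids regex escaping altogether, and if rows where `col4` is NULL should survive the exclusion, add an explicit `col('col4').isNull()` branch combined with `|`.&lt;/P&gt;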
&lt;P&gt;Hope this helps.&lt;/P&gt;</description>
      <pubDate>Tue, 10 Sep 2024 03:52:56 GMT</pubDate>
      <guid>https://community.databricks.com/t5/get-started-discussions/exclude-not-like-filter-using-pyspark/m-p/89237#M8292</guid>
      <dc:creator>brockb</dc:creator>
      <dc:date>2024-09-10T03:52:56Z</dc:date>
    </item>
    <item>
      <title>Re: exclude (not like) filter using pyspark</title>
      <link>https://community.databricks.com/t5/get-started-discussions/exclude-not-like-filter-using-pyspark/m-p/89295#M8293</link>
      <description>&lt;P&gt;Worked perfectly. Thank you.&lt;/P&gt;</description>
      <pubDate>Tue, 10 Sep 2024 13:07:05 GMT</pubDate>
      <guid>https://community.databricks.com/t5/get-started-discussions/exclude-not-like-filter-using-pyspark/m-p/89295#M8293</guid>
      <dc:creator>abueno</dc:creator>
      <dc:date>2024-09-10T13:07:05Z</dc:date>
    </item>
  </channel>
</rss>

