<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Prakash Hinduja Geneva (Swiss) Can I use tools like Great Expectations with Databricks? in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/prakash-hinduja-geneva-swiss-can-i-use-tools-like-great/m-p/126359#M47679</link>
    <description>&lt;P&gt;Hi everyone,&lt;/P&gt;&lt;P&gt;I am Prakash Hinduja from Geneva, Switzerland. I am currently exploring ways to improve data quality checks in my Databricks pipelines and came across Great Expectations. I’d love to know if anyone here has experience using it with Databricks.&lt;/P&gt;&lt;P&gt;Regards,&lt;/P&gt;&lt;P&gt;Prakash Hinduja, Geneva, Switzerland&lt;/P&gt;</description>
    <pubDate>Thu, 24 Jul 2025 12:31:49 GMT</pubDate>
    <dc:creator>prakashhinduja1</dc:creator>
    <dc:date>2025-07-24T12:31:49Z</dc:date>
    <item>
      <title>Prakash Hinduja Geneva (Swiss) Can I use tools like Great Expectations with Databricks?</title>
      <link>https://community.databricks.com/t5/data-engineering/prakash-hinduja-geneva-swiss-can-i-use-tools-like-great/m-p/126359#M47679</link>
      <description>&lt;P&gt;Hi everyone,&lt;/P&gt;&lt;P&gt;I am Prakash Hinduja from Geneva, Switzerland. I am currently exploring ways to improve data quality checks in my Databricks pipelines and came across Great Expectations. I’d love to know if anyone here has experience using it with Databricks.&lt;/P&gt;&lt;P&gt;Regards,&lt;/P&gt;&lt;P&gt;Prakash Hinduja, Geneva, Switzerland&lt;/P&gt;</description>
      <pubDate>Thu, 24 Jul 2025 12:31:49 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/prakash-hinduja-geneva-swiss-can-i-use-tools-like-great/m-p/126359#M47679</guid>
      <dc:creator>prakashhinduja1</dc:creator>
      <dc:date>2025-07-24T12:31:49Z</dc:date>
    </item>
    <item>
      <title>Re: Prakash Hinduja Geneva (Swiss) Can I use tools like Great Expectations with Databricks?</title>
      <link>https://community.databricks.com/t5/data-engineering/prakash-hinduja-geneva-swiss-can-i-use-tools-like-great/m-p/126373#M47681</link>
      <description>&lt;P&gt;Hi Prakash,&lt;BR /&gt;Yes, &lt;A href="https://legacy.017.docs.greatexpectations.io/docs/0.15.50/deployment_patterns/how_to_use_great_expectations_in_databricks/" target="_self"&gt;Great Expectations&lt;/A&gt; integrates well with Databricks and is commonly used to enforce data quality checks in pipelines, such as validating schema, nulls, ranges, or business rules.&lt;/P&gt;
&lt;P&gt;You can use it in a few ways:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;
&lt;P&gt;Directly in Python notebooks using &lt;CODE&gt;%pip install great_expectations&lt;/CODE&gt;&lt;/P&gt;
&lt;/LI&gt;
&lt;LI&gt;
&lt;P&gt;As part of a job or task within a Databricks workflow&lt;/P&gt;
&lt;/LI&gt;
&lt;LI&gt;
&lt;P&gt;Embedded in custom ETL/ELT logic to validate input or output datasets&lt;/P&gt;
&lt;/LI&gt;
&lt;LI&gt;
&lt;P&gt;Optionally, with Data Docs generated for reporting and auditing&lt;/P&gt;
&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;That said, if you're using DLT (Delta Live Tables, now part of Lakeflow), Databricks provides native expectations out of the box. You can define them declaratively like this:&lt;/P&gt;
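&lt;PRE&gt;&lt;CODE class="language-python"&gt;import dlt

@dlt.table  # registers the function as a pipeline dataset so the expectations apply to it
@dlt.expect("non_null_id", "id IS NOT NULL")  # violation is recorded, row is kept
@dlt.expect_or_drop("valid_age", "age BETWEEN 0 AND 120")  # violating rows are dropped
def clean_users():
    return spark.read.table("raw.users")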
&lt;/CODE&gt;&lt;/PRE&gt;
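&lt;P&gt;These expectations &lt;STRONG&gt;track data quality&lt;/STRONG&gt; automatically. Depending on the decorator you choose, a violation can be &lt;STRONG&gt;logged&lt;/STRONG&gt; while the record is kept (&lt;CODE&gt;expect&lt;/CODE&gt;), the invalid record can be &lt;STRONG&gt;dropped&lt;/STRONG&gt; (&lt;CODE&gt;expect_or_drop&lt;/CODE&gt;), or the &lt;STRONG&gt;pipeline can be stopped&lt;/STRONG&gt; (&lt;CODE&gt;expect_or_fail&lt;/CODE&gt;). All results are stored in the DLT event log for visibility.&lt;/P&gt;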
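&lt;P&gt;To make the log, drop, and stop behaviours concrete, here is a minimal plain-Python sketch of the idea. It is not how DLT implements expectations: &lt;CODE&gt;apply_expectation&lt;/CODE&gt; and &lt;CODE&gt;ExpectationFailed&lt;/CODE&gt; are hypothetical names used only for illustration, and the sample rows are made up.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class="language-python"&gt;class ExpectationFailed(Exception):
    """Raised when an expectation with the 'fail' policy sees an invalid row."""


def apply_expectation(rows, name, predicate, action="warn"):
    """Apply one expectation and return (kept_rows, violation_count).

    Hypothetical helper for illustration only; not part of the DLT API.
    """
    if action not in ("warn", "drop", "fail"):
        raise ValueError("unknown action: " + action)
    kept, violations = [], 0
    for row in rows:
        if predicate(row):
            kept.append(row)
            continue
        violations += 1
        if action == "fail":
            raise ExpectationFailed(name)
        if action == "warn":
            kept.append(row)  # warn: count the violation but keep the row
        # drop: the invalid row is simply left out of the result
    return kept, violations


# Made-up sample data: one row with a missing id, one with an implausible age.
rows = [{"id": 1, "age": 34}, {"id": None, "age": 29}, {"id": 3, "age": 150}]

# Same idea as expect("non_null_id", ...): rows are kept, violations are counted.
warned, warn_count = apply_expectation(
    rows, "non_null_id", lambda r: r["id"] is not None
)

# Same idea as expect_or_drop("valid_age", ...): invalid rows are removed.
dropped, drop_count = apply_expectation(
    warned, "valid_age", lambda r: r["age"] in range(0, 121), action="drop"
)

print(len(warned), warn_count, len(dropped), drop_count)
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;In a real pipeline you would not write this yourself. The decorators apply the same kind of policy to the rows of the dataset and record the pass and fail counts for you.&lt;/P&gt;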
&lt;P&gt;If you're already on DLT, native expectations are usually the best starting point.&lt;/P&gt;
</description>
      <pubDate>Thu, 24 Jul 2025 13:13:18 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/prakash-hinduja-geneva-swiss-can-i-use-tools-like-great/m-p/126373#M47681</guid>
      <dc:creator>Nir_Hedvat</dc:creator>
      <dc:date>2025-07-24T13:13:18Z</dc:date>
    </item>
  </channel>
</rss>

