<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Databricks Expectations in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/databricks-expectations/m-p/70246#M34037</link>
    <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/88135"&gt;@brockb&lt;/a&gt;,&lt;/P&gt;&lt;P&gt;I want each failed record to carry a "reason" for being rejected/failed. Is this the best way for me to capture the "reason"?&lt;/P&gt;</description>
    <pubDate>Wed, 22 May 2024 11:16:55 GMT</pubDate>
    <dc:creator>youcanlearn</dc:creator>
    <dc:date>2024-05-22T11:16:55Z</dc:date>
    <item>
      <title>Databricks Expectations</title>
      <link>https://community.databricks.com/t5/data-engineering/databricks-expectations/m-p/69003#M33782</link>
      <description>&lt;P&gt;The example at &lt;A href="https://docs.databricks.com/en/delta-live-tables/expectations.html#fail-on-invalid-records" target="_blank"&gt;https://docs.databricks.com/en/delta-live-tables/expectations.html#fail-on-invalid-records&lt;/A&gt;&amp;nbsp;states that one can query the DLT event log for such expectation violations.&amp;nbsp;&lt;/P&gt;&lt;P&gt;In Databricks, I can use expectations to fail or drop records, but how do I capture the reasons (the expectations violated) for each record that is dropped or failed?&lt;/P&gt;&lt;LI-CODE lang="javascript"&gt;Expectation Violated:
{
  "flowName": "a-b",
  "verboseInfo": {
    "expectationsViolated": [
      "x1 is negative"
    ],
    "inputData": {
      "a": {"x1": 1,"y1": "a },
      "b": {
        "x2": 1,
        "y2": "aa"
      }
    },
    "outputRecord": {
      "x1": 1,
      "y1": "a",
      "x2": 1,
      "y2": "aa"
    },
    "missingInputData": false
  }
}&lt;/LI-CODE&gt;</description>
      <pubDate>Tue, 14 May 2024 14:27:58 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/databricks-expectations/m-p/69003#M33782</guid>
      <dc:creator>youcanlearn</dc:creator>
      <dc:date>2024-05-14T14:27:58Z</dc:date>
    </item>
    <item>
      <title>Re: Databricks Expectations</title>
      <link>https://community.databricks.com/t5/data-engineering/databricks-expectations/m-p/69209#M33858</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/105241"&gt;@youcanlearn&lt;/a&gt;,&lt;BR /&gt;&lt;BR /&gt;This information would be written to `log4j.txt` as part of a stack trace when the expectation is created with one of the `fail` expectation operators (e.g. `expect_or_fail`). When a failure occurs, you would see a `Caused by` log message such as:&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;
&lt;LI-CODE lang="javascript"&gt;Caused by: java.lang.RuntimeException: Expectation violated: {"flowName":"dlt_autoloader_csv_test","verboseInfo":{"expectationsViolated":["valid_max_length"],"inputData":{},"outputRecord":{"col1":"12345678901234567890123456789","col2":"two","_rescued_data":null},"missingInputData":false}}&lt;/LI-CODE&gt;
&lt;P&gt;...which contains a JSON payload like the one in the docs you linked to.&lt;BR /&gt;&lt;BR /&gt;You can also find the stack trace, with the same message, in the Event Log within the `error.exceptions` array.&lt;BR /&gt;&lt;BR /&gt;Hope this helps.&lt;/P&gt;</description>
      <pubDate>Fri, 17 May 2024 02:25:18 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/databricks-expectations/m-p/69209#M33858</guid>
      <dc:creator>brockb</dc:creator>
      <dc:date>2024-05-17T02:25:18Z</dc:date>
    </item>
    <item>
      <title>Re: Databricks Expectations</title>
      <link>https://community.databricks.com/t5/data-engineering/databricks-expectations/m-p/70246#M34037</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/88135"&gt;@brockb&lt;/a&gt;,&lt;/P&gt;&lt;P&gt;I want each failed record to carry a "reason" for being rejected/failed. Is this the best way for me to capture the "reason"?&lt;/P&gt;</description>
      <pubDate>Wed, 22 May 2024 11:16:55 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/databricks-expectations/m-p/70246#M34037</guid>
      <dc:creator>youcanlearn</dc:creator>
      <dc:date>2024-05-22T11:16:55Z</dc:date>
    </item>
    <item>
      <title>Re: Databricks Expectations</title>
      <link>https://community.databricks.com/t5/data-engineering/databricks-expectations/m-p/70272#M34043</link>
      <description>&lt;P&gt;That's right, the "reason" would be&amp;nbsp;"x1 is negative" in your example and "valid_max_length" in the example JSON payload that I shared.&lt;BR /&gt;&lt;BR /&gt;If you are looking for a descriptive reason, you could name the expectation accordingly, for example:&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;
&lt;LI-CODE lang="python"&gt;@dlt.expect_or_fail("this expectation will fail because of reason1 and reason2", "count &amp;gt; 0")&lt;/LI-CODE&gt;
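&lt;P&gt;To pull that "reason" out programmatically, here is a minimal sketch that parses a `Caused by` message like the one I shared earlier. It uses only the Python standard library. The helper name `violated_expectations` and the `MARKER` constant are hypothetical and not part of DLT, and the sketch assumes the message keeps the `Expectation violated: {json}` shape shown in this thread.&lt;/P&gt;
&lt;LI-CODE lang="python"&gt;import json

# Assumption: this marker text precedes the JSON payload, as in the message quoted in this thread.
MARKER = "Expectation violated: "


def violated_expectations(log_line):
    """Return the expectation names listed in one 'Expectation violated' log line."""
    start = log_line.find(MARKER)
    if start == -1:
        # Not an expectation failure message, so there is no reason to report.
        return []
    # raw_decode parses the first JSON value and ignores any text that follows it.
    payload, _ = json.JSONDecoder().raw_decode(log_line[start + len(MARKER):])
    return payload["verboseInfo"]["expectationsViolated"]


# The log message quoted earlier in this thread.
line = 'Caused by: java.lang.RuntimeException: Expectation violated: {"flowName":"dlt_autoloader_csv_test","verboseInfo":{"expectationsViolated":["valid_max_length"],"inputData":{},"outputRecord":{"col1":"12345678901234567890123456789","col2":"two","_rescued_data":null},"missingInputData":false}}'
print(violated_expectations(line))&lt;/LI-CODE&gt;
&lt;P&gt;For that message the list contains the single name "valid_max_length", so a descriptive expectation name comes back as the "reason" for the failed record.&lt;/P&gt;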
</description>
      <pubDate>Wed, 22 May 2024 13:57:59 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/databricks-expectations/m-p/70272#M34043</guid>
      <dc:creator>brockb</dc:creator>
      <dc:date>2024-05-22T13:57:59Z</dc:date>
    </item>
  </channel>
</rss>

