<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic How is model drift calculated when the baseline table has no timestamp column? in Get Started Discussions</title>
    <link>https://community.databricks.com/t5/get-started-discussions/how-is-model-drift-calculated-when-the-baseline-table-has-no/m-p/65610#M6986</link>
    <description>&lt;P&gt;I am trying to understand how Databricks computes model drift when a baseline table is available. My understanding from the documentation is that Databricks processes both the primary and the baseline tables according to the granularities specified in the monitor, stores the results in the profile metrics table, and then uses a measure such as the KS test to compare the distributions of values in the two tables within a given window.&lt;/P&gt;&lt;P&gt;What I can't figure out is how this works if my baseline table has no timestamp column. This is the only information I found in the &lt;A href="https://docs.databricks.com/en/lakehouse-monitoring/index.html" target="_self"&gt;documentation&lt;/A&gt;, and it is very vague:&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;.... The exception is the timestamp column for tables used with time series or inference profiles. If columns are missing in either the primary table or the baseline table, monitoring uses best-effort heuristics to compute the output metrics&lt;/LI-CODE&gt;&lt;P&gt;For example, when I use a model serving endpoint, the timestamp column of my primary table records the time at which a client called the endpoint to get a prediction for a query. Now suppose I want to use my validation dataset as the baseline table. How does Databricks match the rows of the two tables?&lt;/P&gt;</description>
    <pubDate>Fri, 05 Apr 2024 13:08:28 GMT</pubDate>
    <dc:creator>MohsenJ</dc:creator>
    <dc:date>2024-04-05T13:08:28Z</dc:date>
    <item>
      <title>How is model drift calculated when the baseline table has no timestamp column?</title>
      <link>https://community.databricks.com/t5/get-started-discussions/how-is-model-drift-calculated-when-the-baseline-table-has-no/m-p/65610#M6986</link>
      <description>&lt;P&gt;I am trying to understand how Databricks computes model drift when a baseline table is available. My understanding from the documentation is that Databricks processes both the primary and the baseline tables according to the granularities specified in the monitor, stores the results in the profile metrics table, and then uses a measure such as the KS test to compare the distributions of values in the two tables within a given window.&lt;/P&gt;&lt;P&gt;What I can't figure out is how this works if my baseline table has no timestamp column. This is the only information I found in the &lt;A href="https://docs.databricks.com/en/lakehouse-monitoring/index.html" target="_self"&gt;documentation&lt;/A&gt;, and it is very vague:&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;.... The exception is the timestamp column for tables used with time series or inference profiles. If columns are missing in either the primary table or the baseline table, monitoring uses best-effort heuristics to compute the output metrics&lt;/LI-CODE&gt;&lt;P&gt;For example, when I use a model serving endpoint, the timestamp column of my primary table records the time at which a client called the endpoint to get a prediction for a query. Now suppose I want to use my validation dataset as the baseline table. How does Databricks match the rows of the two tables?&lt;/P&gt;</description>
      <pubDate>Fri, 05 Apr 2024 13:08:28 GMT</pubDate>
      <guid>https://community.databricks.com/t5/get-started-discussions/how-is-model-drift-calculated-when-the-baseline-table-has-no/m-p/65610#M6986</guid>
      <dc:creator>MohsenJ</dc:creator>
      <dc:date>2024-04-05T13:08:28Z</dc:date>
    </item>
  </channel>
</rss>

