<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Correlate Databricks App with Logs in Administration &amp; Architecture</title>
    <link>https://community.databricks.com/t5/administration-architecture/correlate-databricks-app-with-logs/m-p/161100#M5374</link>
    <description>&lt;P class=""&gt;&lt;SPAN&gt;I have a question about correlating Databricks system tables.&lt;/SPAN&gt;&lt;/P&gt;&lt;P class=""&gt;&lt;SPAN&gt;We are currently using the &lt;/SPAN&gt;&lt;SPAN&gt;outbound_network&lt;/SPAN&gt;&lt;SPAN&gt; and &lt;/SPAN&gt;&lt;SPAN&gt;audit&lt;/SPAN&gt;&lt;SPAN&gt; system tables. When a Databricks App makes an outbound network request, the request appears in the outbound network table with details such as the destination, &lt;/SPAN&gt;&lt;SPAN&gt;event_id&lt;/SPAN&gt;&lt;SPAN&gt;, and timestamp. However, the &lt;/SPAN&gt;&lt;SPAN&gt;network_source_type&lt;/SPAN&gt;&lt;SPAN&gt; is shown as &lt;/SPAN&gt;&lt;SPAN&gt;unknown&lt;/SPAN&gt;&lt;SPAN&gt;.&lt;/SPAN&gt;&lt;/P&gt;&lt;P class=""&gt;&lt;SPAN&gt;I have tried matching the outbound network &lt;/SPAN&gt;&lt;SPAN&gt;event_id&lt;/SPAN&gt;&lt;SPAN&gt; and timestamp with records in the audit table, but I have not found a reliable relationship between them.&lt;/SPAN&gt;&lt;/P&gt;&lt;P class=""&gt;&lt;SPAN&gt;What is the recommended way to identify which Databricks App generated a specific outbound network request?&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;For example, if we have 100 applications and several outbound calls to different destinations, how can we reliably associate each network event with the application that initiated it?&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;PS. I have checked event_id and timestamps doesn't work to identify the solution.&lt;/SPAN&gt;&lt;/P&gt;</description>
    <pubDate>Wed, 01 Jul 2026 14:58:29 GMT</pubDate>
    <dc:creator>discuss_darende</dc:creator>
    <dc:date>2026-07-01T14:58:29Z</dc:date>
    <item>
      <title>Correlate Databricks App with Logs</title>
      <link>https://community.databricks.com/t5/administration-architecture/correlate-databricks-app-with-logs/m-p/161100#M5374</link>
      <description>&lt;P class=""&gt;&lt;SPAN&gt;I have a question about correlating Databricks system tables.&lt;/SPAN&gt;&lt;/P&gt;&lt;P class=""&gt;&lt;SPAN&gt;We are currently using the &lt;/SPAN&gt;&lt;SPAN&gt;outbound_network&lt;/SPAN&gt;&lt;SPAN&gt; and &lt;/SPAN&gt;&lt;SPAN&gt;audit&lt;/SPAN&gt;&lt;SPAN&gt; system tables. When a Databricks App makes an outbound network request, the request appears in the outbound network table with details such as the destination, &lt;/SPAN&gt;&lt;SPAN&gt;event_id&lt;/SPAN&gt;&lt;SPAN&gt;, and timestamp. However, the &lt;/SPAN&gt;&lt;SPAN&gt;network_source_type&lt;/SPAN&gt;&lt;SPAN&gt; is shown as &lt;/SPAN&gt;&lt;SPAN&gt;unknown&lt;/SPAN&gt;&lt;SPAN&gt;.&lt;/SPAN&gt;&lt;/P&gt;&lt;P class=""&gt;&lt;SPAN&gt;I have tried matching the outbound network &lt;/SPAN&gt;&lt;SPAN&gt;event_id&lt;/SPAN&gt;&lt;SPAN&gt; and timestamp with records in the audit table, but I have not found a reliable relationship between them.&lt;/SPAN&gt;&lt;/P&gt;&lt;P class=""&gt;&lt;SPAN&gt;What is the recommended way to identify which Databricks App generated a specific outbound network request?&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;For example, if we have 100 applications and several outbound calls to different destinations, how can we reliably associate each network event with the application that initiated it?&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;PS. I have checked event_id and timestamps doesn't work to identify the solution.&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 01 Jul 2026 14:58:29 GMT</pubDate>
      <guid>https://community.databricks.com/t5/administration-architecture/correlate-databricks-app-with-logs/m-p/161100#M5374</guid>
      <dc:creator>discuss_darende</dc:creator>
      <dc:date>2026-07-01T14:58:29Z</dc:date>
    </item>
    <item>
      <title>Re: Correlate Databricks App with Logs</title>
      <link>https://community.databricks.com/t5/administration-architecture/correlate-databricks-app-with-logs/m-p/161128#M5375</link>
      <description>&lt;P class="p1"&gt;&lt;SPAN class="s1"&gt;Hi &lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/211981"&gt;@discuss_darende&lt;/a&gt;, there's no join key today that ties an outbound_network row back to a specific app instance. network_source_type only tells you the request came from the "Apps" compute&lt;/SPAN&gt;&lt;SPAN class="s1"&gt;&lt;SPAN class="Apple-converted-space"&gt;&amp;nbsp;&lt;/SPAN&gt;category, not which of your 100 apps made it. That's also why matching on event_id or timestamp against audit isn't working, audit's app events (createApp, updateApp, the ACL change&lt;/SPAN&gt;&lt;SPAN class="s1"&gt;&lt;SPAN class="Apple-converted-space"&gt;&amp;nbsp;&lt;/SPAN&gt;events) are control plane actions. They don't fire once per outbound HTTP call, so there's nothing on that side to line up with a network event in the first place.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class="p1"&gt;&lt;SPAN class="s1"&gt;A few ways to actually get there at your scale:&lt;/SPAN&gt;&lt;/P&gt;
&lt;OL&gt;
&lt;LI class="p1"&gt;&lt;SPAN class="s1"&gt;Filter by destination and time window first. Since you're presumably allowlisting domains per app already, most outbound_network rows can be mapped back to an app just by which&lt;/SPAN&gt;&amp;nbsp;&lt;SPAN&gt;destination it hit and when that app was running. With 100 apps, odds are most of them aren't calling the exact same handful of third party endpoints, so this gets you further than it&lt;/SPAN&gt;&amp;nbsp;&lt;SPAN&gt;sounds.&lt;/SPAN&gt;&lt;/LI&gt;
&lt;LI class="p1"&gt;Instrument the apps themselves. You control the app code, so this is the reliable fix. Turn on OpenTelemetry auto instrumentation for whatever framework you're running (Flask, FastAPI, Streamlit, etc.) and it'll capture outbound HTTP calls as spans. Databricks Apps can export otel_logs, otel_spans, and otel_metrics into Unity Catalog once you wire up the OTel exporter,&amp;nbsp;&lt;SPAN&gt;so you get per app, per request traces you control instead of trying to reverse engineer identity out of the network table.&lt;/SPAN&gt;&lt;/LI&gt;
&lt;LI class="p1"&gt;If you're on serverless egress with Network Connectivity Configs, you can scope a separate NCC per app so each one can only reach its own destinations. That makes the destination&amp;nbsp;&lt;SPAN&gt;itself a de facto identifier. Not a real join key, but workable if you're already set up that way.&lt;/SPAN&gt;&lt;/LI&gt;
&lt;/OL&gt;
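&lt;P class="p1"&gt;&lt;SPAN class="s1"&gt;To make option 1 concrete, here is a minimal sketch in plain Python of destination-based attribution. It is only an illustration: the app names, the domains, the APP_ALLOWLIST mapping, and the dictionary keys on each event are all hypothetical stand-ins for your own allowlist config and for rows you have already pulled out of outbound_network. It leaves out the time-window filter and simply marks an event as ambiguous when its destination is allowed for zero or several apps.&lt;/SPAN&gt;&lt;/P&gt;

```python
from collections import defaultdict

# Hypothetical per-app allowlists (assumption: you maintain something like this yourself).
APP_ALLOWLIST = {
    "billing-app": {"api.stripe.example", "hooks.slack.example"},
    "report-app": {"api.weather.example"},
    "alerts-app": {"hooks.slack.example"},
}


def build_destination_index(allowlist):
    """Invert the app-to-destinations mapping into destination-to-apps."""
    index = defaultdict(set)
    for app, destinations in allowlist.items():
        for destination in destinations:
            index[destination].add(app)
    return index


def attribute_events(events, allowlist):
    """Label each event with the single app allowed to reach its destination.

    Events whose destination maps to zero or several apps get app=None,
    because the destination alone cannot identify the caller.
    """
    index = build_destination_index(allowlist)
    results = []
    for event in events:
        candidates = index.get(event["destination"], set())
        app = next(iter(candidates)) if len(candidates) == 1 else None
        results.append({**event, "app": app, "candidates": sorted(candidates)})
    return results


# Hypothetical rows; the keys are illustrative, not the real table schema.
events = [
    {"event_id": "e1", "destination": "api.stripe.example"},
    {"event_id": "e2", "destination": "hooks.slack.example"},
    {"event_id": "e3", "destination": "unknown.example"},
]
attributed = attribute_events(events, APP_ALLOWLIST)
for row in attributed:
    print(row["event_id"], row["app"], row["candidates"])
```

&lt;P class="p1"&gt;&lt;SPAN class="s1"&gt;Rows that come back with more than one candidate are the ones where destination alone is not enough, which is where the time window, a per-app NCC, or app-side instrumentation has to take over.&lt;/SPAN&gt;&lt;/P&gt;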
&lt;P class="p1"&gt;&lt;SPAN class="s1"&gt;Instrumenting the apps directly (option 2) is going to be more durable option.&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 01 Jul 2026 21:08:25 GMT</pubDate>
      <guid>https://community.databricks.com/t5/administration-architecture/correlate-databricks-app-with-logs/m-p/161128#M5375</guid>
      <dc:creator>iyashk-DB</dc:creator>
      <dc:date>2026-07-01T21:08:25Z</dc:date>
    </item>
  </channel>
</rss>

