<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic How should I correctly extract the full table name from request_params in audit logs? in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/how-should-i-correctly-extract-the-full-table-name-from-request/m-p/148659#M52942</link>
    <description>&lt;P&gt;I’m trying to build a UC usage/refresh tracking table for every workspace. For each workspace, I want to know how many times a UC table was refreshed or accessed each month. To do this, I’m reading the Databricks audit logs and I need to extract only the correct full table name from the request_params field.&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Sample Output Table (What I’m Trying to Build):&lt;/STRONG&gt;&lt;/P&gt;&lt;DIV&gt;&lt;P&gt;Just to show what I’m aiming for:&lt;/P&gt;&lt;DIV&gt;&amp;nbsp; &lt;TABLE border="1" width="100%"&gt;&lt;TBODY&gt;&lt;TR&gt;&lt;TD width="20%" height="57px"&gt;&lt;STRONG&gt;workspace_id&lt;/STRONG&gt;&lt;/TD&gt;&lt;TD width="20%" height="57px"&gt;&lt;STRONG&gt;workspace_name&lt;/STRONG&gt;&lt;/TD&gt;&lt;TD width="20%" height="57px"&gt;&lt;STRONG&gt;month (event date)&lt;/STRONG&gt;&lt;/TD&gt;&lt;TD width="20%" height="57px"&gt;&lt;STRONG&gt;full_table_name&lt;/STRONG&gt;&lt;/TD&gt;&lt;TD width="20%" height="57px"&gt;&lt;STRONG&gt;Refresh_count (count of distinct event date)&lt;/STRONG&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD height="30px"&gt;&lt;DIV&gt;12345&lt;/DIV&gt;&lt;/TD&gt;&lt;TD height="30px"&gt;&lt;DIV&gt;ws-prod&lt;/DIV&gt;&lt;/TD&gt;&lt;TD height="30px"&gt;&lt;DIV&gt;2026-01-01&lt;/DIV&gt;&lt;/TD&gt;&lt;TD height="30px"&gt;&lt;DIV&gt;main.sales.orders&lt;/DIV&gt;&lt;/TD&gt;&lt;TD height="30px"&gt;14&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD height="30px"&gt;&lt;DIV&gt;12345&lt;/DIV&gt;&lt;/TD&gt;&lt;TD height="30px"&gt;&lt;DIV&gt;ws-prod&lt;/DIV&gt;&lt;/TD&gt;&lt;TD height="30px"&gt;&lt;DIV&gt;2026-01-01&lt;/DIV&gt;&lt;/TD&gt;&lt;TD height="30px"&gt;&lt;DIV&gt;main.marketing.accounts&lt;/DIV&gt;&lt;/TD&gt;&lt;TD height="30px"&gt;6&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD height="30px"&gt;&lt;DIV&gt;67890&lt;/DIV&gt;&lt;/TD&gt;&lt;TD height="30px"&gt;&lt;DIV&gt;ws-dev&lt;/DIV&gt;&lt;/TD&gt;&lt;TD height="30px"&gt;&lt;DIV&gt;2026-01-01&lt;/DIV&gt;&lt;/TD&gt;&lt;TD height="30px"&gt;&lt;DIV&gt;main.hr.employee_master&lt;/DIV&gt;&lt;/TD&gt;&lt;TD height="30px"&gt;10&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;DIV&gt;&lt;P&gt;While parsing request_params, I’m seeing multiple different structures depending on the event type:&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;table_full_name&lt;/LI&gt;&lt;LI&gt;full_name_arg&lt;/LI&gt;&lt;LI&gt;securables (array with type and full_name)&lt;/LI&gt;&lt;LI&gt;separate fields like catalog, schema, and table&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;Because of this inconsistency, I’m not sure which one is the correct or recommended field to extract the full table name from. My goal is to capture a clean and reliable &amp;lt;catalog&amp;gt;.&amp;lt;schema&amp;gt;.&amp;lt;table&amp;gt; value for each refresh/access event.&lt;/P&gt;&lt;H3&gt;&lt;STRONG&gt;My question:&lt;/STRONG&gt;&lt;/H3&gt;&lt;P&gt;What is the right or best‑practice approach to handle these different structures in request_params and determine which field should be treated as the authoritative full table name?&lt;/P&gt;&lt;P&gt;I want to avoid extracting the wrong name or parsing unnecessary fields, and I’d like to follow whatever Databricks considers the standard going forward.&lt;/P&gt;&lt;P&gt;Any guidance or recommendations would be appreciated.&lt;/P&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;/DIV&gt;</description>
    <pubDate>Wed, 18 Feb 2026 04:57:20 GMT</pubDate>
    <dc:creator>Danish11052000</dc:creator>
    <dc:date>2026-02-18T04:57:20Z</dc:date>
    <item>
      <title>How should I correctly extract the full table name from request_params in audit logs?</title>
      <link>https://community.databricks.com/t5/data-engineering/how-should-i-correctly-extract-the-full-table-name-from-request/m-p/148659#M52942</link>
      <description>&lt;P&gt;I’m trying to build a UC usage/refresh tracking table for every workspace. For each workspace, I want to know how many times a UC table was refreshed or accessed each month. To do this, I’m reading the Databricks audit logs and I need to extract only the correct full table name from the request_params field.&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Sample Output Table (What I’m Trying to Build):&lt;/STRONG&gt;&lt;/P&gt;&lt;DIV&gt;&lt;P&gt;Just to show what I’m aiming for:&lt;/P&gt;&lt;DIV&gt;&amp;nbsp; &lt;TABLE border="1" width="100%"&gt;&lt;TBODY&gt;&lt;TR&gt;&lt;TD width="20%" height="57px"&gt;&lt;STRONG&gt;workspace_id&lt;/STRONG&gt;&lt;/TD&gt;&lt;TD width="20%" height="57px"&gt;&lt;STRONG&gt;workspace_name&lt;/STRONG&gt;&lt;/TD&gt;&lt;TD width="20%" height="57px"&gt;&lt;STRONG&gt;month (event date)&lt;/STRONG&gt;&lt;/TD&gt;&lt;TD width="20%" height="57px"&gt;&lt;STRONG&gt;full_table_name&lt;/STRONG&gt;&lt;/TD&gt;&lt;TD width="20%" height="57px"&gt;&lt;STRONG&gt;Refresh_count (count of distinct event date)&lt;/STRONG&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD height="30px"&gt;&lt;DIV&gt;12345&lt;/DIV&gt;&lt;/TD&gt;&lt;TD height="30px"&gt;&lt;DIV&gt;ws-prod&lt;/DIV&gt;&lt;/TD&gt;&lt;TD height="30px"&gt;&lt;DIV&gt;2026-01-01&lt;/DIV&gt;&lt;/TD&gt;&lt;TD height="30px"&gt;&lt;DIV&gt;main.sales.orders&lt;/DIV&gt;&lt;/TD&gt;&lt;TD height="30px"&gt;14&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD height="30px"&gt;&lt;DIV&gt;12345&lt;/DIV&gt;&lt;/TD&gt;&lt;TD height="30px"&gt;&lt;DIV&gt;ws-prod&lt;/DIV&gt;&lt;/TD&gt;&lt;TD height="30px"&gt;&lt;DIV&gt;2026-01-01&lt;/DIV&gt;&lt;/TD&gt;&lt;TD height="30px"&gt;&lt;DIV&gt;main.marketing.accounts&lt;/DIV&gt;&lt;/TD&gt;&lt;TD height="30px"&gt;6&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD height="30px"&gt;&lt;DIV&gt;67890&lt;/DIV&gt;&lt;/TD&gt;&lt;TD height="30px"&gt;&lt;DIV&gt;ws-dev&lt;/DIV&gt;&lt;/TD&gt;&lt;TD height="30px"&gt;&lt;DIV&gt;2026-01-01&lt;/DIV&gt;&lt;/TD&gt;&lt;TD height="30px"&gt;&lt;DIV&gt;main.hr.employee_master&lt;/DIV&gt;&lt;/TD&gt;&lt;TD height="30px"&gt;10&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;DIV&gt;&lt;P&gt;While parsing request_params, I’m seeing multiple different structures depending on the event type:&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;table_full_name&lt;/LI&gt;&lt;LI&gt;full_name_arg&lt;/LI&gt;&lt;LI&gt;securables (array with type and full_name)&lt;/LI&gt;&lt;LI&gt;separate fields like catalog, schema, and table&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;Because of this inconsistency, I’m not sure which one is the correct or recommended field to extract the full table name from. My goal is to capture a clean and reliable &amp;lt;catalog&amp;gt;.&amp;lt;schema&amp;gt;.&amp;lt;table&amp;gt; value for each refresh/access event.&lt;/P&gt;&lt;H3&gt;&lt;STRONG&gt;My question:&lt;/STRONG&gt;&lt;/H3&gt;&lt;P&gt;What is the right or best‑practice approach to handle these different structures in request_params and determine which field should be treated as the authoritative full table name?&lt;/P&gt;&lt;P&gt;I want to avoid extracting the wrong name or parsing unnecessary fields, and I’d like to follow whatever Databricks considers the standard going forward.&lt;/P&gt;&lt;P&gt;Any guidance or recommendations would be appreciated.&lt;/P&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;/DIV&gt;</description>
      <pubDate>Wed, 18 Feb 2026 04:57:20 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/how-should-i-correctly-extract-the-full-table-name-from-request/m-p/148659#M52942</guid>
      <dc:creator>Danish11052000</dc:creator>
      <dc:date>2026-02-18T04:57:20Z</dc:date>
    </item>
    <item>
      <title>Re: How should I correctly extract the full table name from request_params in audit logs?</title>
      <link>https://community.databricks.com/t5/data-engineering/how-should-i-correctly-extract-the-full-table-name-from-request/m-p/149487#M53108</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/188867"&gt;@Danish11052000&lt;/a&gt;,&lt;/P&gt;
&lt;P&gt;Is there a reason you prefer building your own table for this? I'm asking because there are simpler and more reliable patterns than hand-parsing.&lt;/P&gt;
&lt;P class="p8i6j01 paragraph"&gt;If the account has system tables enabled, you can query &lt;CODE class="p8i6j0f"&gt;system.access.audit&lt;/CODE&gt; directly instead of ingesting raw logs to your own table. It already exposes:&lt;/P&gt;
&lt;UL class="p8i6j07 p8i6j02"&gt;
&lt;LI class="p8i6j0a"&gt;&lt;STRONG&gt;&lt;CODE class="p8i6j0f"&gt;workspace_id&lt;/CODE&gt;&lt;/STRONG&gt; – the workspace that triggered the event.&lt;/LI&gt;
&lt;LI class="p8i6j0a"&gt;&lt;STRONG&gt;&lt;CODE class="p8i6j0f"&gt;service_name&lt;/CODE&gt;, &lt;CODE class="p8i6j0f"&gt;action_name&lt;/CODE&gt;, &lt;CODE class="p8i6j0f"&gt;event_time&lt;/CODE&gt;, &lt;CODE class="p8i6j0f"&gt;event_date&lt;/CODE&gt;&lt;/STRONG&gt;&amp;nbsp;- for filtering by UC events and time.&lt;/LI&gt;
&lt;LI class="p8i6j0a"&gt;&lt;STRONG&gt;&lt;CODE class="p8i6j0f"&gt;request_params&lt;/CODE&gt; (MAP&amp;lt;STRING,STRING&amp;gt;)&lt;/STRONG&gt;&amp;nbsp;- all the RPC parameters you are currently parsing.&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;&lt;SPAN&gt;System tables are available in the &lt;/SPAN&gt;&lt;CODE class="p8i6j0f"&gt;system&lt;/CODE&gt;&lt;SPAN&gt; catalog and include both &lt;/SPAN&gt;&lt;STRONG style="color: #1b3139; font-family: inherit;"&gt;&lt;CODE class="p8i6j0f"&gt;system.access.audit&lt;/CODE&gt;&lt;/STRONG&gt;&lt;SPAN&gt; (audit logs) and &lt;/SPAN&gt;&lt;STRONG style="color: #1b3139; font-family: inherit;"&gt;&lt;CODE class="p8i6j0f"&gt;system.access.table_lineage&lt;/CODE&gt;&lt;/STRONG&gt;&lt;SPAN&gt; (per-table read/write events).&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class="p8i6j01 paragraph"&gt;If you can use these:&lt;/P&gt;
&lt;UL class="p8i6j07 p8i6j02"&gt;
&lt;LI class="p8i6j0a"&gt;You don’t need to maintain your own ingestion pipeline.&lt;/LI&gt;
&lt;LI class="p8i6j0a"&gt;You get 365 days of history out of the box (for audit + lineage).&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;You can still layer your monthly UC usage/refresh metrics&lt;SPAN&gt; on top with a simple SQL view or table.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;As you pointed out, the challenge is that multiple keys are populated in the request_parms. You can try coalescing all of this into a single value. Example given below.&lt;/SPAN&gt;&lt;/P&gt;
&lt;LI-CODE lang="markup"&gt;-- in system.access.audit
COALESCE(
  request_params.table_full_name,
  request_params.full_name_arg,
  request_params.securable_full_name,
  CONCAT(
    request_params.catalog_name, '.', 
    request_params.schema_name, '.', 
    request_params.name
  ),
  FROM_JSON(
    request_params.securables,
    'ARRAY&amp;lt;STRUCT&amp;lt;type: STRING, full_name: STRING&amp;gt;&amp;gt;'
  )[0].full_name
) AS table_full_name&lt;/LI-CODE&gt;
&lt;P class="p8i6j01 paragraph"&gt;So instead of “extracting the correct table name” differently per &lt;CODE class="p8i6j0f"&gt;action_name&lt;/CODE&gt;, you can:&lt;/P&gt;
&lt;OL class="p8i6j02"&gt;
&lt;LI class="p8i6j0a"&gt;Put this exact &lt;CODE class="p8i6j0f"&gt;COALESCE&lt;/CODE&gt; expression in a &lt;STRONG&gt;view&lt;/STRONG&gt; over &lt;CODE class="p8i6j0f"&gt;system.access.audit&lt;/CODE&gt; (or your own logs table).&lt;/LI&gt;
&lt;LI class="p8i6j0a"&gt;Always reference &lt;CODE class="p8i6j0f"&gt;table_full_name&lt;/CODE&gt; from that view.&lt;/LI&gt;
&lt;/OL&gt;
&lt;P&gt;Hope this helps!&lt;/P&gt;
&lt;P class="p1"&gt;&lt;FONT size="3" color="#FF6600"&gt;&lt;STRONG&gt;&lt;I&gt;If this answer resolves your question, could you mark it as “Accept as Solution”? That helps other users quickly find the correct fix.&lt;/I&gt;&lt;/STRONG&gt;&lt;/FONT&gt;&lt;I&gt;&lt;/I&gt;&lt;/P&gt;
&lt;P&gt;Regards,&lt;/P&gt;</description>
      <pubDate>Fri, 27 Feb 2026 09:18:30 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/how-should-i-correctly-extract-the-full-table-name-from-request/m-p/149487#M53108</guid>
      <dc:creator>Ashwin_DSA</dc:creator>
      <dc:date>2026-02-27T09:18:30Z</dc:date>
    </item>
  </channel>
</rss>

