<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Integrating Azure Log Analytics with Delta Live Tables Pipelines and Job Clusters in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/integrating-azure-log-analytics-with-delta-live-tables-pipelines/m-p/102273#M41049</link>
    <description>&lt;P class="_1t7bu9h1 paragraph"&gt;&lt;SPAN&gt;To address your questions about setting up a Delta Live Tables (DLT) pipeline for your medallion architecture and integrating it with Azure Log Analytics, here are the detailed steps and best practices:&lt;/SPAN&gt;&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;
&lt;P class="_1t7bu9h1 paragraph"&gt;&lt;SPAN&gt;&lt;STRONG&gt;Connecting Azure Log Analytics via Azure Key Vault for Secure Access:&lt;/STRONG&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;SPAN&gt;Yes, it is possible to connect Azure Log Analytics via Azure Key Vault for secure access. Azure Key Vault can securely store and manage access to secrets, such as connection strings and API keys, which can be used by your Databricks environment. You can configure Azure Key Vault to store the necessary credentials and then access these secrets from your Databricks notebooks or jobs using the Databricks Secrets API.&lt;/SPAN&gt;&lt;/LI&gt;
&lt;LI&gt;
&lt;P class="_1t7bu9h1 paragraph"&gt;&lt;SPAN&gt;&lt;STRONG&gt;Handling Configuration on Job Clusters for DLT Pipelines:&lt;/STRONG&gt;&lt;/SPAN&gt;&lt;/P&gt;
Since DLT pipelines run on job clusters, you need to ensure that the job clusters have the necessary configurations to access Azure Log Analytics. Here are the steps:
&lt;UL class="_1t7bu9h7 _1t7bu9h2"&gt;
&lt;LI&gt;&lt;SPAN&gt;&lt;STRONG&gt;Create and Configure Azure Key Vault:&lt;/STRONG&gt; Store your Azure Log Analytics workspace ID and primary key in Azure Key Vault.&lt;/SPAN&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;SPAN&gt;&lt;STRONG&gt;Set Up Databricks Secrets:&lt;/STRONG&gt; Create an Azure Key Vault-backed secret scope in Databricks (via the UI or the Databricks CLI) so that the secrets stored in Key Vault can be read from the workspace.&lt;/SPAN&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;SPAN&gt;&lt;STRONG&gt;Access Secrets in Your DLT Pipeline:&lt;/STRONG&gt; In your DLT pipeline notebooks, use the &lt;CODE&gt;dbutils.secrets.get&lt;/CODE&gt; function to retrieve the secrets and configure the logging.&lt;/SPAN&gt;&lt;/LI&gt;
&lt;/UL&gt;
&lt;/LI&gt;
&lt;LI&gt;
&lt;P class="_1t7bu9h1 paragraph"&gt;&lt;SPAN&gt;&lt;STRONG&gt;Ingesting and Configuring Log Files for Azure Log Analytics:&lt;/STRONG&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;SPAN&gt;The &lt;CODE&gt;_delta_log&lt;/CODE&gt; files are the Delta Lake transaction log: they record table commits, not pipeline or cluster logs, so they are not what you ingest into Log Analytics. For integration with Azure Log Analytics, you can use the following approach:&lt;/SPAN&gt;
&lt;UL class="_1t7bu9h7 _1t7bu9h2"&gt;
&lt;LI&gt;&lt;STRONG&gt;Enable Cluster Logging:&lt;/STRONG&gt; Ensure that cluster logging is enabled to capture logs and metrics.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Use Azure Monitor:&lt;/STRONG&gt; Configure Azure Monitor to collect logs from your Databricks clusters and send them to Azure Log Analytics. This can be done by setting up diagnostic settings in Azure Monitor to route logs to your Log Analytics workspace.&lt;/LI&gt;
&lt;/UL&gt;
&lt;/LI&gt;
&lt;LI&gt;
&lt;P class="_1t7bu9h1 paragraph"&gt;&lt;SPAN&gt;&lt;STRONG&gt;Best Practices:&lt;/STRONG&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;UL class="_1t7bu9h7 _1t7bu9h2"&gt;
&lt;LI&gt;&lt;SPAN&gt;&lt;STRONG&gt;Secure Access:&lt;/STRONG&gt; Always use Azure Key Vault to manage and access secrets securely.&lt;/SPAN&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;SPAN&gt;&lt;STRONG&gt;Monitor and Audit:&lt;/STRONG&gt; Use Azure Monitor and Log Analytics to continuously monitor and audit your DLT pipelines.&lt;/SPAN&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;SPAN&gt;&lt;STRONG&gt;Data Quality and Lineage:&lt;/STRONG&gt; Use the event log schema provided by Databricks to track data quality metrics and lineage information for your DLT pipelines.&lt;/SPAN&gt;&lt;/LI&gt;
&lt;/UL&gt;
&lt;/LI&gt;
&lt;/OL&gt;</description>
    <pubDate>Mon, 16 Dec 2024 16:44:59 GMT</pubDate>
    <dc:creator>Walter_C</dc:creator>
    <dc:date>2024-12-16T16:44:59Z</dc:date>
    <item>
      <title>Integrating Azure Log Analytics with Delta Live Tables Pipelines and Job Clusters</title>
      <link>https://community.databricks.com/t5/data-engineering/integrating-azure-log-analytics-with-delta-live-tables-pipelines/m-p/102270#M41047</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;I'm setting up a Delta Live Tables (DLT) pipeline for my medallion architecture. I’m interested in tracking, ingesting, and analyzing the log files in Azure Log Analytics. However, I haven’t found much information on how to configure this setup.&lt;/P&gt;&lt;P&gt;Specifically, I have the following questions:&lt;/P&gt;&lt;OL&gt;&lt;LI&gt;Is it possible to connect Azure Log Analytics via an Azure Key Vault for secure access?&lt;/LI&gt;&lt;LI&gt;Since DLT pipelines run on job clusters instead of regular clusters (as described in earlier documentation), how should I handle this configuration? On the current job cluster used for my DLT pipeline, log file destinations are not enabled.&lt;/LI&gt;&lt;/OL&gt;&lt;P&gt;Additionally, does the logging process involve the _delta_log files, or is there another recommended way to ingest and configure log files for Azure Log Analytics?&lt;/P&gt;&lt;P&gt;Any guidance or best practices on this integration would be greatly appreciated!&lt;/P&gt;&lt;P&gt;Thank you!&lt;/P&gt;</description>
      <pubDate>Mon, 16 Dec 2024 15:31:09 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/integrating-azure-log-analytics-with-delta-live-tables-pipelines/m-p/102270#M41047</guid>
      <dc:creator>mkEngineer</dc:creator>
      <dc:date>2024-12-16T15:31:09Z</dc:date>
    </item>
    <item>
      <title>Re: Integrating Azure Log Analytics with Delta Live Tables Pipelines and Job Clusters</title>
      <link>https://community.databricks.com/t5/data-engineering/integrating-azure-log-analytics-with-delta-live-tables-pipelines/m-p/102273#M41049</link>
      <description>&lt;P class="_1t7bu9h1 paragraph"&gt;&lt;SPAN&gt;To address your questions about setting up a Delta Live Tables (DLT) pipeline for your medallion architecture and integrating it with Azure Log Analytics, here are the detailed steps and best practices:&lt;/SPAN&gt;&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;
&lt;P class="_1t7bu9h1 paragraph"&gt;&lt;SPAN&gt;&lt;STRONG&gt;Connecting Azure Log Analytics via Azure Key Vault for Secure Access:&lt;/STRONG&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;SPAN&gt;Yes, it is possible to connect Azure Log Analytics via Azure Key Vault for secure access. Azure Key Vault can securely store and manage access to secrets, such as connection strings and API keys, which can be used by your Databricks environment. You can configure Azure Key Vault to store the necessary credentials and then access these secrets from your Databricks notebooks or jobs using the Databricks Secrets API.&lt;/SPAN&gt;&lt;/LI&gt;
&lt;LI&gt;
&lt;P class="_1t7bu9h1 paragraph"&gt;&lt;SPAN&gt;&lt;STRONG&gt;Handling Configuration on Job Clusters for DLT Pipelines:&lt;/STRONG&gt;&lt;/SPAN&gt;&lt;/P&gt;
Since DLT pipelines run on job clusters, you need to ensure that the job clusters have the necessary configurations to access Azure Log Analytics. Here are the steps:
&lt;UL class="_1t7bu9h7 _1t7bu9h2"&gt;
&lt;LI&gt;&lt;SPAN&gt;&lt;STRONG&gt;Create and Configure Azure Key Vault:&lt;/STRONG&gt; Store your Azure Log Analytics workspace ID and primary key in Azure Key Vault.&lt;/SPAN&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;SPAN&gt;&lt;STRONG&gt;Set Up Databricks Secrets:&lt;/STRONG&gt; Create an Azure Key Vault-backed secret scope in Databricks (via the UI or the Databricks CLI) so that the secrets stored in Key Vault can be read from the workspace.&lt;/SPAN&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;SPAN&gt;&lt;STRONG&gt;Access Secrets in Your DLT Pipeline:&lt;/STRONG&gt; In your DLT pipeline notebooks, use the &lt;CODE&gt;dbutils.secrets.get&lt;/CODE&gt; function to retrieve the secrets and configure the logging.&lt;/SPAN&gt;&lt;/LI&gt;
&lt;/UL&gt;
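&lt;P class="_1t7bu9h1 paragraph"&gt;&lt;SPAN&gt;As a rough illustration of these steps, here is a minimal sketch that reads the two secrets and builds the shared-key authorization header. It assumes you are sending logs through the Azure Monitor HTTP Data Collector API, which is the API that uses a workspace ID and primary key; the scope name &lt;CODE&gt;log-analytics-scope&lt;/CODE&gt; and the key names &lt;CODE&gt;workspace-id&lt;/CODE&gt; and &lt;CODE&gt;primary-key&lt;/CODE&gt; are hypothetical, and &lt;CODE&gt;dbutils&lt;/CODE&gt; is only available inside Databricks, so the demonstration at the bottom uses a dummy key.&lt;/SPAN&gt;&lt;/P&gt;

```python
import base64
import hashlib
import hmac
from email.utils import formatdate


def build_signature(workspace_id, shared_key, date, content_length):
    """Build the SharedKey Authorization header value.

    Assumes the HTTP Data Collector API signing scheme: an HMAC-SHA256
    over a fixed string, keyed with the base64-decoded primary key.
    """
    string_to_sign = (
        f"POST\n{content_length}\napplication/json\nx-ms-date:{date}\n/api/logs"
    )
    decoded_key = base64.b64decode(shared_key)
    digest = hmac.new(
        decoded_key, string_to_sign.encode("utf-8"), hashlib.sha256
    ).digest()
    signature = base64.b64encode(digest).decode("utf-8")
    return f"SharedKey {workspace_id}:{signature}"


def load_credentials(dbutils, scope="log-analytics-scope"):
    """Read both secrets from a secret scope (scope and key names are hypothetical)."""
    workspace_id = dbutils.secrets.get(scope=scope, key="workspace-id")
    shared_key = dbutils.secrets.get(scope=scope, key="primary-key")
    return workspace_id, shared_key


# Local demonstration with a dummy key; in a notebook you would call
# load_credentials(dbutils) instead of hard-coding values.
rfc1123_date = formatdate(usegmt=True)
dummy_key = base64.b64encode(b"not-a-real-key").decode("utf-8")
header = build_signature(
    "00000000-0000-0000-0000-000000000000", dummy_key, rfc1123_date, 2
)
```

&lt;P class="_1t7bu9h1 paragraph"&gt;&lt;SPAN&gt;The header value and the same date string would then go into the &lt;CODE&gt;Authorization&lt;/CODE&gt; and &lt;CODE&gt;x-ms-date&lt;/CODE&gt; headers of the POST request to your workspace. Passing &lt;CODE&gt;dbutils&lt;/CODE&gt; in as an argument keeps the signing logic testable outside Databricks.&lt;/SPAN&gt;&lt;/P&gt;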
&lt;/LI&gt;
&lt;LI&gt;
&lt;P class="_1t7bu9h1 paragraph"&gt;&lt;SPAN&gt;&lt;STRONG&gt;Ingesting and Configuring Log Files for Azure Log Analytics:&lt;/STRONG&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;SPAN&gt;The &lt;CODE&gt;_delta_log&lt;/CODE&gt; files are the Delta Lake transaction log: they record table commits, not pipeline or cluster logs, so they are not what you ingest into Log Analytics. For integration with Azure Log Analytics, you can use the following approach:&lt;/SPAN&gt;
&lt;UL class="_1t7bu9h7 _1t7bu9h2"&gt;
&lt;LI&gt;&lt;STRONG&gt;Enable Cluster Logging:&lt;/STRONG&gt; Ensure that cluster logging is enabled to capture logs and metrics.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Use Azure Monitor:&lt;/STRONG&gt; Configure Azure Monitor to collect logs from your Databricks clusters and send them to Azure Log Analytics. This can be done by setting up diagnostic settings in Azure Monitor to route logs to your Log Analytics workspace.&lt;/LI&gt;
&lt;/UL&gt;
&lt;/LI&gt;
&lt;LI&gt;
&lt;P class="_1t7bu9h1 paragraph"&gt;&lt;SPAN&gt;&lt;STRONG&gt;Best Practices:&lt;/STRONG&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;UL class="_1t7bu9h7 _1t7bu9h2"&gt;
&lt;LI&gt;&lt;SPAN&gt;&lt;STRONG&gt;Secure Access:&lt;/STRONG&gt; Always use Azure Key Vault to manage and access secrets securely.&lt;/SPAN&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;SPAN&gt;&lt;STRONG&gt;Monitor and Audit:&lt;/STRONG&gt; Use Azure Monitor and Log Analytics to continuously monitor and audit your DLT pipelines.&lt;/SPAN&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;SPAN&gt;&lt;STRONG&gt;Data Quality and Lineage:&lt;/STRONG&gt; Use the event log schema provided by Databricks to track data quality metrics and lineage information for your DLT pipelines.&lt;/SPAN&gt;&lt;/LI&gt;
&lt;/UL&gt;
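&lt;P class="_1t7bu9h1 paragraph"&gt;&lt;SPAN&gt;To make the data quality point more concrete, here is a small sketch of how expectation metrics could be pulled out of one event log record. It assumes the &lt;CODE&gt;details&lt;/CODE&gt; column of a &lt;CODE&gt;flow_progress&lt;/CODE&gt; event is a JSON string with expectations nested under &lt;CODE&gt;flow_progress.data_quality.expectations&lt;/CODE&gt;; the sample record, the dataset name, and the expectation name are made up for illustration.&lt;/SPAN&gt;&lt;/P&gt;

```python
import json


def summarize_expectations(details_json):
    """Return one summary row per expectation in an event's details JSON."""
    details = json.loads(details_json)
    # Each level may be missing or null on events that carry no quality data.
    flow_progress = details.get("flow_progress") or {}
    data_quality = flow_progress.get("data_quality") or {}
    expectations = data_quality.get("expectations") or []

    rows = []
    for expectation in expectations:
        passed = expectation.get("passed_records", 0)
        failed = expectation.get("failed_records", 0)
        total = passed + failed
        rows.append(
            {
                "dataset": expectation.get("dataset"),
                "expectation": expectation.get("name"),
                "passed": passed,
                "failed": failed,
                "failure_rate": failed / total if total else 0.0,
            }
        )
    return rows


# Hypothetical details payload, shaped like the assumed event log schema.
sample_details = json.dumps(
    {
        "flow_progress": {
            "data_quality": {
                "expectations": [
                    {
                        "name": "valid_order_id",
                        "dataset": "silver_orders",
                        "passed_records": 98,
                        "failed_records": 2,
                    }
                ]
            }
        }
    }
)
rows = summarize_expectations(sample_details)
```

&lt;P class="_1t7bu9h1 paragraph"&gt;&lt;SPAN&gt;Rows in this shape are flat enough to forward to Log Analytics as custom log records, one per expectation.&lt;/SPAN&gt;&lt;/P&gt;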
&lt;/LI&gt;
&lt;/OL&gt;</description>
      <pubDate>Mon, 16 Dec 2024 16:44:59 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/integrating-azure-log-analytics-with-delta-live-tables-pipelines/m-p/102273#M41049</guid>
      <dc:creator>Walter_C</dc:creator>
      <dc:date>2024-12-16T16:44:59Z</dc:date>
    </item>
    <item>
      <title>Re: Integrating Azure Log Analytics with Delta Live Tables Pipelines and Job Clusters</title>
      <link>https://community.databricks.com/t5/data-engineering/integrating-azure-log-analytics-with-delta-live-tables-pipelines/m-p/102401#M41093</link>
      <description>&lt;PRE&gt;"message": " File &amp;lt;command-68719476741&amp;gt;, line 10\n log_analytics_pkey = dbutils.secrets.get(scope=\"ScopeLogAnalyticsPKey\", key=\"LogAnalyticsPKey\")\n ^\nSyntaxError: invalid syntax\n", "error_class": "_UNCLASSIFIED_PYTHON_COMMAND_ERROR"&lt;/PRE&gt;
&lt;P&gt;It seems odd that this configuration has to be handled at the command-line level. Could you guide me further on how to set it up, given that it doesn’t work in the notebook? Specifically, is there a way to configure the secrets directly in the &lt;STRONG&gt;JSON settings&lt;/STRONG&gt; or the &lt;STRONG&gt;DLT UI Advanced Configuration&lt;/STRONG&gt;?&lt;/P&gt;
&lt;P&gt;For example, could I pass my two secrets (Log Analytics Workspace ID and Log Analytics Primary Key, stored in Key Vault) as key-value pairs under &lt;STRONG&gt;Advanced Configuration&lt;/STRONG&gt;? How does the scope I just created come into play here? Or is that section only for secrets created in the CLI’s secret scope?&lt;/P&gt;
&lt;P&gt;Simply put, can I use the &lt;STRONG&gt;Advanced Configuration&lt;/STRONG&gt; (key-value pairs) to set these secrets and avoid relying on code-based retrieval?&lt;/P&gt;
&lt;P&gt;On another note, how can I verify that cluster logging is enabled? Besides checking the &lt;STRONG&gt;Logs&lt;/STRONG&gt; and &lt;STRONG&gt;Metrics&lt;/STRONG&gt; sections in the DLT Pipeline UI under &lt;STRONG&gt;Compute/Clusters&lt;/STRONG&gt;, is there another way to ensure that logging and metrics are correctly captured? Those tabs are enabled, but that alone does not confirm that logging is enabled.&lt;/P&gt;
&lt;P&gt;Also, I noticed in the &lt;STRONG&gt;Compute&lt;/STRONG&gt; page under &lt;STRONG&gt;Advanced Options&lt;/STRONG&gt; the setting:&lt;BR /&gt;&lt;EM&gt;"When a user runs a command on a cluster with Credential Passthrough enabled, that user's Azure Active Directory credentials will be automatically passed through to Spark, allowing them to access data in Azure Data Lake Storage Gen1 and Gen2 without having to manually specify their credentials."&lt;/EM&gt;&lt;/P&gt;
&lt;P&gt;However, I’m unable to change the destination from "None." Could this be related to Unity Catalog being enabled? If so, does Unity Catalog impose restrictions on credential passthrough or on how secrets are managed with Azure Key Vault?&lt;/P&gt;
&lt;P&gt;Lastly, when running the notebooks for DLT, I noticed that a fourth tab called &lt;STRONG&gt;Pipeline Logs&lt;/STRONG&gt; briefly appears at the bottom of the page (next to DLT Graph, DLT Event Log, and DLT Query History), but it disappears after about a second.&lt;/P&gt;
&lt;P&gt;I suspect my Azure Monitor setup is mostly correct, but it seems like the logs are not being routed to the Log Analytics Workspace. Can you confirm whether the route for logs should be explicitly set elsewhere, or whether there’s an issue with the configuration itself?&lt;/P&gt;
&lt;P&gt;Thanks again for your help!&lt;/P&gt;</description>
      <pubDate>Tue, 17 Dec 2024 14:57:46 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/integrating-azure-log-analytics-with-delta-live-tables-pipelines/m-p/102401#M41093</guid>
      <dc:creator>mkEngineer</dc:creator>
      <dc:date>2024-12-17T14:57:46Z</dc:date>
    </item>
  </channel>
</rss>

