<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Data leakage risk happened when we use the Azure Databricks workspace in Administration &amp; Architecture</title>
    <link>https://community.databricks.com/t5/administration-architecture/data-leakage-risk-happened-when-we-use-the-azure-databricks/m-p/97026#M2211</link>
    <description>&lt;P&gt;&lt;SPAN&gt;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/109619"&gt;@ccsong&lt;/a&gt;&amp;nbsp;have you found the root cause of the malicious flow entries? We are seeing similar behavior involving similar URLs. Has anyone else seen this and can explain the malicious flows?&lt;/SPAN&gt;&lt;/P&gt;</description>
    <pubDate>Thu, 31 Oct 2024 16:18:51 GMT</pubDate>
    <dc:creator>cuser731</dc:creator>
    <dc:date>2024-10-31T16:18:51Z</dc:date>
    <item>
      <title>Data leakage risk happened when we use the Azure Databricks workspace</title>
      <link>https://community.databricks.com/t5/administration-architecture/data-leakage-risk-happened-when-we-use-the-azure-databricks/m-p/94274#M2084</link>
      <description>&lt;P&gt;Context:&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;We are using an Azure Databricks workspace for data management and model serving within our project, with a delegated VNet and subnets configured specifically for this workspace. However, we consistently observe malicious flow entries in the VNet flow logs. It appears that a background script is continuously running and sending requests to certain URLs and IP addresses. We are running runtime version 15.4.x-cpu-ml-scala2.12, with no third-party libraries installed.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;The URLs look like this:&amp;nbsp;&lt;A href="https://chandramoulisangabathula01.github.io" target="_blank" rel="noopener"&gt;https://chandramoulisangabathula01.github.io&lt;/A&gt;, &lt;A href="http://yasse5n.github.io/EDJOSK" target="_blank" rel="noopener"&gt;http://yasse5n.github.io/EDJOSK&lt;/A&gt;, and &lt;A href="https://solankisuryansh.github.io/CloneNetflix" target="_blank" rel="noopener"&gt;https://solankisuryansh.github.io/CloneNetflix&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;Here is a screenshot of one of them:&lt;/P&gt;
&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="Screenshot 2024-10-16 at 18.14.45.png" style="width: 589px;"&gt;&lt;img src="https://community.databricks.com/t5/image/serverpage/image-id/11989i807A84882AB4FC4C/image-size/large?v=v2&amp;amp;px=999" role="button" title="Screenshot 2024-10-16 at 18.14.45.png" alt="Screenshot 2024-10-16 at 18.14.45.png" /&gt;&lt;/span&gt;&lt;/P&gt;
&lt;P&gt;The IPs are listed in the screenshot below:&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="Screenshot 2024-10-16 at 18.15.19.png" style="width: 999px;"&gt;&lt;img src="https://community.databricks.com/t5/image/serverpage/image-id/11987i4C6473ADB2B86A02/image-size/large?v=v2&amp;amp;px=999" role="button" title="Screenshot 2024-10-16 at 18.15.19.png" alt="Screenshot 2024-10-16 at 18.15.19.png" /&gt;&lt;/span&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;The requests go out through a Databricks-configured aclRule called "microsoft.databricks-workspaces_useonly_databricks-worker-to-worker-outbound", as shown in the screenshot below:&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="Screenshot 2024-10-16 at 18.18.24.png" style="width: 999px;"&gt;&lt;img src="https://community.databricks.com/t5/image/serverpage/image-id/11988iCF0BB7B996425088/image-size/large?v=v2&amp;amp;px=999" role="button" title="Screenshot 2024-10-16 at 18.18.24.png" alt="Screenshot 2024-10-16 at 18.18.24.png" /&gt;&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 16 Oct 2024 10:50:19 GMT</pubDate>
      <guid>https://community.databricks.com/t5/administration-architecture/data-leakage-risk-happened-when-we-use-the-azure-databricks/m-p/94274#M2084</guid>
      <dc:creator>ccsong</dc:creator>
      <dc:date>2024-10-16T10:50:19Z</dc:date>
    </item>
    <item>
      <title>Re: Data leakage risk happened when we use the Azure Databricks workspace</title>
      <link>https://community.databricks.com/t5/administration-architecture/data-leakage-risk-happened-when-we-use-the-azure-databricks/m-p/97018#M2210</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/109619"&gt;@ccsong&lt;/a&gt;,&lt;/P&gt;
&lt;P&gt;Greetings from Databricks!&lt;/P&gt;
&lt;P&gt;It looks like this needs a support case for further investigation. Do you have an active support plan?&lt;/P&gt;
&lt;P&gt;If so, could you please submit a request?&lt;/P&gt;
&lt;P&gt;Please refer to:&amp;nbsp;&lt;A href="https://docs.databricks.com/en/resources/support.html" target="_blank"&gt;https://docs.databricks.com/en/resources/support.html&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 31 Oct 2024 15:51:27 GMT</pubDate>
      <guid>https://community.databricks.com/t5/administration-architecture/data-leakage-risk-happened-when-we-use-the-azure-databricks/m-p/97018#M2210</guid>
      <dc:creator>Alberto_Umana</dc:creator>
      <dc:date>2024-10-31T15:51:27Z</dc:date>
    </item>
    <item>
      <title>Re: Data leakage risk happened when we use the Azure Databricks workspace</title>
      <link>https://community.databricks.com/t5/administration-architecture/data-leakage-risk-happened-when-we-use-the-azure-databricks/m-p/97026#M2211</link>
      <description>&lt;P&gt;&lt;SPAN&gt;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/109619"&gt;@ccsong&lt;/a&gt;&amp;nbsp;have you found the root cause of the malicious flow entries? We are seeing similar behavior involving similar URLs. Has anyone else seen this and can explain the malicious flows?&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 31 Oct 2024 16:18:51 GMT</pubDate>
      <guid>https://community.databricks.com/t5/administration-architecture/data-leakage-risk-happened-when-we-use-the-azure-databricks/m-p/97026#M2211</guid>
      <dc:creator>cuser731</dc:creator>
      <dc:date>2024-10-31T16:18:51Z</dc:date>
    </item>
    <item>
      <title>Re: Data leakage risk happened when we use the Azure Databricks workspace</title>
      <link>https://community.databricks.com/t5/administration-architecture/data-leakage-risk-happened-when-we-use-the-azure-databricks/m-p/97775#M2240</link>
      <description>&lt;P class="p1"&gt;Hello everyone!&lt;/P&gt;
&lt;P class="p1"&gt;We have worked with our security team, Microsoft, and other customers who have seen similar log messages.&lt;/P&gt;
&lt;P class="p1"&gt;This log message is very misleading, as it appears to state that the malicious URI was detected within your network&amp;nbsp;—&amp;nbsp;this would be a major concern were it the case.&amp;nbsp;However, as&amp;nbsp;we’ve&amp;nbsp;learned when working with those other customers, that URI is just an example of a malicious URI that has previously been associated with that IP.&amp;nbsp;But it&amp;nbsp;wasn’t&amp;nbsp;observed within your network.&lt;/P&gt;
&lt;P class="p1"&gt;Apart from by checking with Microsoft, you can validate this because the data source for this (flow logs) operate only at layer3/4 and cannot actually contain URIs. We have also seen these alerts on connections blocked at the firewall (would never be able to request a URI) and also on encrypted connections (where the tool&amp;nbsp;wouldn’t&amp;nbsp;be able to see the URI).&lt;/P&gt;
&lt;P class="p1"&gt;The IP address in question is for github.io, so all that is actually occurring to trigger this is any connection to&amp;nbsp;&lt;A href="http://github.io/" target="_blank"&gt;github.io&lt;/A&gt;.&amp;nbsp;In practice, we have high confidence this is a call to&amp;nbsp;&lt;A href="http://nvidia.github.io/" target="_blank"&gt;nvidia.github.io&lt;/A&gt;&amp;nbsp;that is issued on some Azure Databricks systems based on Nvidia drivers.&lt;/P&gt;
&lt;P class="p1"&gt;In summary: based on conversations with Microsoft and lengthy analysis across multiple customers, this is just a very misleading log message and not an indication of any infection.&lt;/P&gt;</description>
      <pubDate>Tue, 05 Nov 2024 14:50:47 GMT</pubDate>
      <guid>https://community.databricks.com/t5/administration-architecture/data-leakage-risk-happened-when-we-use-the-azure-databricks/m-p/97775#M2240</guid>
      <dc:creator>Alberto_Umana</dc:creator>
      <dc:date>2024-11-05T14:50:47Z</dc:date>
    </item>
  </channel>
</rss>

