<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Spark readStream kafka.ssl.keystore.location abfss path in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/spark-readstream-kafka-ssl-keystore-location-abfss-path/m-p/45256#M27833</link>
    <description>&lt;P&gt;This is similar to &lt;A href="https://community.databricks.com/t5/data-engineering/kafka-unable-to-read-client-keystore-jks/td-p/23301" target="_blank" rel="noopener"&gt;https://community.databricks.com/t5/data-engineering/kafka-unable-to-read-client-keystore-jks/td-p/23301&lt;/A&gt;. The documentation (&lt;A href="https://learn.microsoft.com/en-gb/azure/databricks/structured-streaming/kafka#use-ssl-to-connect-azure-databricks-to-kafka" target="_blank" rel="noopener"&gt;https://learn.microsoft.com/en-gb/azure/databricks/structured-streaming/kafka#use-ssl-to-connect-azure-databricks-to-kafka&lt;/A&gt;) recommends that certificates for authenticating with Kafka be kept in cloud storage, and the example&amp;nbsp;&lt;EM&gt;appears&lt;/EM&gt; to hint that it should be possible to read them directly from that location. In practice, however, Spark appears unable to read from abfss paths directly.&lt;BR /&gt;&lt;BR /&gt;Setting kafka.ssl.keystore.location and kafka.ssl.truststore.location to abfss paths results, for me, in:&lt;/P&gt;&lt;DIV class=""&gt;&lt;PRE&gt;kafkashaded.org.apache.kafka.common.KafkaException: Failed to create new KafkaAdminClient&lt;BR /&gt;...&lt;BR /&gt;Caused by: kafkashaded.org.apache.kafka.common.KafkaException: Failed to load SSL keystore abfss://{container}@{account}.dfs.core.windows.net/client.keystore.p12 of type PKCS12&lt;BR /&gt;...&lt;BR /&gt;Caused by: java.nio.file.NoSuchFileException: abfss:/{container}@{account}.dfs.core.windows.net/client.keystore.p12&lt;/PRE&gt;&lt;P&gt;The paths have been double-checked as correct, and the account has been granted read permission on the external location.&lt;/P&gt;&lt;P&gt;Can we get confirmation that reading directly is not possible, and that the recommendation is to copy the file(s) to a local temp path first and reference those paths in the kafka.ssl.*.location config options?
Or should it be possible to read directly from abfss paths?&lt;/P&gt;&lt;/DIV&gt;</description>
    <pubDate>Mon, 18 Sep 2023 16:29:39 GMT</pubDate>
    <dc:creator>mwoods</dc:creator>
    <dc:date>2023-09-18T16:29:39Z</dc:date>
    <item>
      <title>Spark readStream kafka.ssl.keystore.location abfss path</title>
      <link>https://community.databricks.com/t5/data-engineering/spark-readstream-kafka-ssl-keystore-location-abfss-path/m-p/45256#M27833</link>
      <description>&lt;P&gt;This is similar to &lt;A href="https://community.databricks.com/t5/data-engineering/kafka-unable-to-read-client-keystore-jks/td-p/23301" target="_blank" rel="noopener"&gt;https://community.databricks.com/t5/data-engineering/kafka-unable-to-read-client-keystore-jks/td-p/23301&lt;/A&gt;. The documentation (&lt;A href="https://learn.microsoft.com/en-gb/azure/databricks/structured-streaming/kafka#use-ssl-to-connect-azure-databricks-to-kafka" target="_blank" rel="noopener"&gt;https://learn.microsoft.com/en-gb/azure/databricks/structured-streaming/kafka#use-ssl-to-connect-azure-databricks-to-kafka&lt;/A&gt;) recommends that certificates for authenticating with Kafka be kept in cloud storage, and the example&amp;nbsp;&lt;EM&gt;appears&lt;/EM&gt; to hint that it should be possible to read them directly from that location. In practice, however, Spark appears unable to read from abfss paths directly.&lt;BR /&gt;&lt;BR /&gt;Setting kafka.ssl.keystore.location and kafka.ssl.truststore.location to abfss paths results, for me, in:&lt;/P&gt;&lt;DIV class=""&gt;&lt;PRE&gt;kafkashaded.org.apache.kafka.common.KafkaException: Failed to create new KafkaAdminClient&lt;BR /&gt;...&lt;BR /&gt;Caused by: kafkashaded.org.apache.kafka.common.KafkaException: Failed to load SSL keystore abfss://{container}@{account}.dfs.core.windows.net/client.keystore.p12 of type PKCS12&lt;BR /&gt;...&lt;BR /&gt;Caused by: java.nio.file.NoSuchFileException: abfss:/{container}@{account}.dfs.core.windows.net/client.keystore.p12&lt;/PRE&gt;&lt;P&gt;The paths have been double-checked as correct, and the account has been granted read permission on the external location.&lt;/P&gt;&lt;P&gt;Can we get confirmation that reading directly is not possible, and that the recommendation is to copy the file(s) to a local temp path first and reference those paths in the kafka.ssl.*.location config options?
Or should it be possible to read directly from abfss paths?&lt;/P&gt;&lt;/DIV&gt;</description>
      <pubDate>Mon, 18 Sep 2023 16:29:39 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/spark-readstream-kafka-ssl-keystore-location-abfss-path/m-p/45256#M27833</guid>
      <dc:creator>mwoods</dc:creator>
      <dc:date>2023-09-18T16:29:39Z</dc:date>
    </item>
    <item>
      <title>Re: Spark readStream kafka.ssl.keystore.location abfss path</title>
      <link>https://community.databricks.com/t5/data-engineering/spark-readstream-kafka-ssl-keystore-location-abfss-path/m-p/45680#M27965</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/9"&gt;@Retired_mod&lt;/a&gt; - thanks for your response.&lt;/P&gt;&lt;P&gt;Not sure what's happened here, but this is now working for me. Either the issue has been patched, or it was somehow related to my group management: the external location read permissions were mapped to a "Data Engineer" group that existed at the account level but was not actually being mapped to the respective workspace via &lt;SPAN&gt;&lt;SPAN class=""&gt;databricks_mws_permission_assignment&lt;/SPAN&gt;&lt;/SPAN&gt; (which I have now rectified).&lt;/P&gt;&lt;P&gt;I think the latter is probably the case, though if so I'm not sure why I was able to copy the file successfully via dbutils.fs.cp as a workaround until now (as that seemed to imply that I &lt;EM&gt;did&lt;/EM&gt; have access to read the file).&lt;/P&gt;</description>
      <pubDate>Fri, 22 Sep 2023 14:16:59 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/spark-readstream-kafka-ssl-keystore-location-abfss-path/m-p/45680#M27965</guid>
      <dc:creator>mwoods</dc:creator>
      <dc:date>2023-09-22T14:16:59Z</dc:date>
    </item>
    <item>
      <title>Re: Spark readStream kafka.ssl.keystore.location abfss path</title>
      <link>https://community.databricks.com/t5/data-engineering/spark-readstream-kafka-ssl-keystore-location-abfss-path/m-p/45749#M27976</link>
      <description>&lt;P&gt;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/9"&gt;@Retired_mod&lt;/a&gt; - quick update: I managed to find the cause. It's neither of the above; it's a bug in the Databricks 14.0 runtime. I had switched back to the 13.3 LTS runtime, and &lt;EM&gt;that&lt;/EM&gt; is what caused the error to disappear.&lt;/P&gt;&lt;P&gt;As soon as I try to read directly from abfss paths using a compute resource with the DBR 14.0 runtime, I get the error again, so it appears a bug has been introduced in that release.&lt;/P&gt;</description>
      <pubDate>Fri, 22 Sep 2023 19:09:59 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/spark-readstream-kafka-ssl-keystore-location-abfss-path/m-p/45749#M27976</guid>
      <dc:creator>mwoods</dc:creator>
      <dc:date>2023-09-22T19:09:59Z</dc:date>
    </item>
  </channel>
</rss>

