<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Spark Kafka Client Not Using Certs from Default truststore in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/spark-kafka-client-not-using-certs-from-default-truststore/m-p/120330#M46141</link>
    <description>&lt;P&gt;Hi Team,&amp;nbsp;&lt;BR /&gt;&lt;BR /&gt;I'm working on connecting Databricks to an external Kafka cluster secured with SASL_SSL (SCRAM-SHA-512 + certificate trust). We've encountered an issue where certificates imported into the default JVM truststore (cacerts) via an init script are not picked up by Spark’s Kafka connector, unless we explicitly create and reference a .jks truststore.&lt;BR /&gt;What We've Done:&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;for N in $(seq 0 $((CERTS - 1))); do
  ALIAS="custom-cert-$N"
  awk "n==$N{print} /END CERTIFICATE/{n++}" "$PEM_FILE" | \
    keytool -noprompt -import -trustcacerts \
    -alias "$ALIAS" -keystore "$KEYSTORE" -storepass "$PASSWORD"
done&lt;/LI-CODE&gt;&lt;P&gt;This correctly added entries such as custom-cert-0 and custom-cert-1 to the cacerts store, which we verified with keytool -list.&lt;/P&gt;&lt;P&gt;Despite this, .write.format("kafka") failed with SSL handshake errors until we disabled hostname verification with:&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;.option("kafka.ssl.endpoint.identification.algorithm", "")&lt;/LI-CODE&gt;&lt;P&gt;With that option set we could push messages to the Kafka topic and read them back from a Databricks notebook, but not securely, since an empty identification algorithm disables hostname verification. We also tried both abfss:// and /mnt volume-mounting approaches for the .jks truststore, but neither let us reference it reliably. And because we're using Unity Catalog, we can't reference paths under /dbfs directly in code, which further limits our options.&lt;/P&gt;&lt;P&gt;Any guidance or workaround would be greatly appreciated.&lt;/P&gt;&lt;P&gt;Thank you in advance!&lt;/P&gt;</description>
    <pubDate>Tue, 27 May 2025 14:34:33 GMT</pubDate>
    <dc:creator>Mahtab67</dc:creator>
    <dc:date>2025-05-27T14:34:33Z</dc:date>
    <item>
      <title>Spark Kafka Client Not Using Certs from Default truststore</title>
      <link>https://community.databricks.com/t5/data-engineering/spark-kafka-client-not-using-certs-from-default-truststore/m-p/120330#M46141</link>
      <description>&lt;P&gt;Hi Team,&amp;nbsp;&lt;BR /&gt;&lt;BR /&gt;I'm working on connecting Databricks to an external Kafka cluster secured with SASL_SSL (SCRAM-SHA-512 + certificate trust). We've encountered an issue where certificates imported into the default JVM truststore (cacerts) via an init script are not picked up by Spark’s Kafka connector, unless we explicitly create and reference a .jks truststore.&lt;BR /&gt;What We've Done:&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;for N in $(seq 0 $((CERTS - 1))); do
  ALIAS="custom-cert-$N"
  awk "n==$N{print} /END CERTIFICATE/{n++}" "$PEM_FILE" | \
    keytool -noprompt -import -trustcacerts \
    -alias "$ALIAS" -keystore "$KEYSTORE" -storepass "$PASSWORD"
done&lt;/LI-CODE&gt;&lt;P&gt;This correctly added entries such as custom-cert-0 and custom-cert-1 to the cacerts store, which we verified with keytool -list.&lt;/P&gt;&lt;P&gt;Despite this, .write.format("kafka") failed with SSL handshake errors until we disabled hostname verification with:&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;.option("kafka.ssl.endpoint.identification.algorithm", "")&lt;/LI-CODE&gt;&lt;P&gt;With that option set we could push messages to the Kafka topic and read them back from a Databricks notebook, but not securely, since an empty identification algorithm disables hostname verification. We also tried both abfss:// and /mnt volume-mounting approaches for the .jks truststore, but neither let us reference it reliably. And because we're using Unity Catalog, we can't reference paths under /dbfs directly in code, which further limits our options.&lt;/P&gt;&lt;P&gt;Any guidance or workaround would be greatly appreciated.&lt;/P&gt;&lt;P&gt;Thank you in advance!&lt;/P&gt;</description>
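The per-certificate extraction that the awk/keytool loop above performs can be hard to read as a one-liner. Below is a minimal Python sketch of the same splitting logic (not the poster's exact script; the sample bundle and alias names are illustrative): it walks a PEM bundle and cuts it into individual certificate blocks, one per END CERTIFICATE marker, which is what each iteration of the loop pipes into keytool.

```python
# Hedged sketch: split a PEM bundle into individual certificates, mirroring
# what the awk one-liner does before each `keytool -import` call.

def split_pem_bundle(pem_text):
    """Return a list of individual PEM certificate blocks from a bundle."""
    certs = []
    current = []
    for line in pem_text.splitlines():
        current.append(line)
        # awk increments its counter on each END CERTIFICATE line,
        # so every marker closes one certificate block.
        if "END CERTIFICATE" in line:
            certs.append("\n".join(current))
            current = []
    return certs

# Illustrative two-certificate bundle (payloads are placeholders).
bundle = """-----BEGIN CERTIFICATE-----
AAA
-----END CERTIFICATE-----
-----BEGIN CERTIFICATE-----
BBB
-----END CERTIFICATE-----"""

for n, cert in enumerate(split_pem_bundle(bundle)):
    # Each block would be imported as alias custom-cert-0, custom-cert-1, ...
    print(f"custom-cert-{n}: {len(cert.splitlines())} lines")
```

Each returned block corresponds to one `custom-cert-$N` alias in the init script.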
      <pubDate>Tue, 27 May 2025 14:34:33 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/spark-kafka-client-not-using-certs-from-default-truststore/m-p/120330#M46141</guid>
      <dc:creator>Mahtab67</dc:creator>
      <dc:date>2025-05-27T14:34:33Z</dc:date>
    </item>
    <item>
      <title>Re: Spark Kafka Client Not Using Certs from Default truststore</title>
      <link>https://community.databricks.com/t5/data-engineering/spark-kafka-client-not-using-certs-from-default-truststore/m-p/120354#M46144</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/166115"&gt;@Mahtab67&lt;/a&gt;&lt;/P&gt;&lt;P&gt;This is a common issue with Databricks and Kafka SSL connectivity. The problem stems from how Spark's Kafka connector initializes its SSL context, as opposed to relying on the JVM's default truststore.&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Root Cause Analysis:&lt;/STRONG&gt;&lt;BR /&gt;The Spark Kafka connector creates its own SSL context and does not automatically inherit certificates from the JVM's default cacerts truststore. When you disable hostname verification, you bypass certificate identity checks entirely, which explains why the connection works but is not secure.&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Solution Options&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Option 1: JVM System Properties (Recommended)&lt;/STRONG&gt;&lt;BR /&gt;Set JVM-level SSL properties in your cluster configuration. This forces all SSL connections to use your custom truststore. Cluster Spark config:&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;spark.driver.extraJavaOptions -Djavax.net.ssl.trustStore=/path/to/your/truststore.jks -Djavax.net.ssl.trustStorePassword=your_password -Djavax.net.ssl.trustStoreType=JKS
spark.executor.extraJavaOptions -Djavax.net.ssl.trustStore=/path/to/your/truststore.jks -Djavax.net.ssl.trustStorePassword=your_password -Djavax.net.ssl.trustStoreType=JKS&lt;/LI-CODE&gt;&lt;P&gt;&lt;STRONG&gt;Option 2: Kafka-Specific SSL Configuration&lt;/STRONG&gt;&lt;BR /&gt;Instead of relying on the default truststore, explicitly configure the Kafka SSL options:&lt;/P&gt;&lt;LI-CODE lang="python"&gt;df.write \
  .format("kafka") \
  .option("kafka.bootstrap.servers", "your-kafka-servers:9093") \
  .option("kafka.security.protocol", "SASL_SSL") \
  .option("kafka.sasl.mechanism", "SCRAM-SHA-512") \
  .option("kafka.sasl.jaas.config", "org.apache.kafka.common.security.scram.ScramLoginModule required username='user' password='pass';") \
  .option("kafka.ssl.truststore.location", "/path/to/truststore.jks") \
  .option("kafka.ssl.truststore.password", "truststore_password") \
  .option("kafka.ssl.truststore.type", "JKS") \
  .option("kafka.ssl.endpoint.identification.algorithm", "https") \
  .save()&lt;/LI-CODE&gt;&lt;P&gt;&lt;STRONG&gt;Option 3: Unity Catalog Compatible Approach&lt;/STRONG&gt;&lt;BR /&gt;Since you're using Unity Catalog, store your truststore in a volume and reference it from there.&lt;/P&gt;&lt;P&gt;1. Create the volume if it does not exist:&lt;/P&gt;&lt;LI-CODE lang="sql"&gt;CREATE VOLUME IF NOT EXISTS your_catalog.your_schema.kafka_certs;&lt;/LI-CODE&gt;&lt;P&gt;2. Upload your .jks file to the volume via the UI or:&lt;/P&gt;&lt;LI-CODE lang="python"&gt;# Copy from the local filesystem to the volume
dbutils.fs.cp("file:/tmp/truststore.jks", "/Volumes/your_catalog/your_schema/kafka_certs/truststore.jks")&lt;/LI-CODE&gt;&lt;P&gt;3. Reference it in the Kafka configuration:&lt;/P&gt;&lt;LI-CODE lang="python"&gt;truststore_path = "/Volumes/your_catalog/your_schema/kafka_certs/truststore.jks"

df.write \
  .format("kafka") \
  .option("kafka.bootstrap.servers", "your-servers:9093") \
  .option("kafka.security.protocol", "SASL_SSL") \
  .option("kafka.sasl.mechanism", "SCRAM-SHA-512") \
  .option("kafka.sasl.jaas.config", jaas_config) \
  .option("kafka.ssl.truststore.location", truststore_path) \
  .option("kafka.ssl.truststore.password", truststore_password) \
  .save()&lt;/LI-CODE&gt;&lt;P&gt;&lt;STRONG&gt;Option 4: Init Script with Proper JVM Configuration&lt;/STRONG&gt;&lt;BR /&gt;Modify your init script so that it not only imports the certificates but also exposes the truststore location to Spark:&lt;/P&gt;&lt;LI-CODE lang="bash"&gt;#!/bin/bash

# Your existing certificate import logic
for N in $(seq 0 $((CERTS - 1))); do
  ALIAS="custom-cert-$N"
  awk "n==$N{print} /END CERTIFICATE/{n++}" "$PEM_FILE" | \
    keytool -noprompt -import -trustcacerts \
    -alias "$ALIAS" -keystore "$KEYSTORE" -storepass "$PASSWORD"
done

# Create a separate truststore for Kafka
KAFKA_TRUSTSTORE="/databricks/driver/kafka-truststore.jks"
cp "$KEYSTORE" "$KAFKA_TRUSTSTORE"

# Set environment variables
echo "export KAFKA_TRUSTSTORE_LOCATION=$KAFKA_TRUSTSTORE" &amp;gt;&amp;gt; /databricks/spark/conf/spark-env.sh
echo "export KAFKA_TRUSTSTORE_PASSWORD=$PASSWORD" &amp;gt;&amp;gt; /databricks/spark/conf/spark-env.sh&lt;/LI-CODE&gt;&lt;P&gt;&lt;STRONG&gt;Best Practice Recommendation&lt;/STRONG&gt;&lt;BR /&gt;I recommend Option 3 (the Unity Catalog approach) combined with explicit Kafka SSL configuration because it:&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;is Unity Catalog compliant;&lt;/LI&gt;&lt;LI&gt;provides explicit control over SSL settings;&lt;/LI&gt;&lt;LI&gt;maintains security best practices;&lt;/LI&gt;&lt;LI&gt;is auditable and manageable.&lt;/LI&gt;&lt;/UL&gt;</description>
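The explicit SSL/SASL settings recommended in this reply are the same for reads and writes, so it can help to build them once. A minimal, hedged sketch (server address, volume path, and credentials are placeholders, not working values): a small helper that assembles the kafka.-prefixed options as a plain dict, which a notebook could then splat into a Spark reader or writer.

```python
# Hedged sketch: assemble the explicit Kafka SSL/SASL options described in the
# reply as a reusable dict. All concrete values below are placeholders.

def kafka_ssl_options(bootstrap, truststore_path, truststore_password,
                      username, password):
    """Build the kafka.-prefixed option map for a SASL_SSL + SCRAM-SHA-512 cluster."""
    jaas = (
        "org.apache.kafka.common.security.scram.ScramLoginModule required "
        f"username='{username}' password='{password}';"
    )
    return {
        "kafka.bootstrap.servers": bootstrap,
        "kafka.security.protocol": "SASL_SSL",
        "kafka.sasl.mechanism": "SCRAM-SHA-512",
        "kafka.sasl.jaas.config": jaas,
        "kafka.ssl.truststore.location": truststore_path,
        "kafka.ssl.truststore.password": truststore_password,
        "kafka.ssl.truststore.type": "JKS",
        # Keep hostname verification ON; "https" is the secure setting,
        # unlike the empty string used as a workaround in the question.
        "kafka.ssl.endpoint.identification.algorithm": "https",
    }

opts = kafka_ssl_options(
    "broker:9093",
    "/Volumes/your_catalog/your_schema/kafka_certs/truststore.jks",  # hypothetical UC volume path
    "changeit",  # placeholder truststore password
    "user",
    "pass",
)
```

In a notebook this would be used as, e.g., `df.write.format("kafka").options(**opts).save()`, keeping the security settings in one place instead of repeating ten `.option(...)` calls per job.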
      <pubDate>Tue, 27 May 2025 17:48:33 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/spark-kafka-client-not-using-certs-from-default-truststore/m-p/120354#M46144</guid>
      <dc:creator>lingareddy_Alva</dc:creator>
      <dc:date>2025-05-27T17:48:33Z</dc:date>
    </item>
  </channel>
</rss>

