Support for managed identity based authentication in python kafka client

Kruthika
New Contributor

We followed this document https://docs.databricks.com/aws/en/connect/streaming/kafka?language=Python#msk-aad to use the Kafka client.

As part of the SFI (Secure Future Initiative), the guidance is to move away from client secrets and use managed identity instead. In our investigation so far, we have not found a way to do this. Can you please guide us on how to achieve it in this case?

What we have tried so far:

  1. Creating a token ourselves and passing it directly in the JAAS config did not work:
    kafka_options = {
        "kafka.bootstrap.servers": brokers,
        "subscribe": topic,
        "kafka.security.protocol": "SASL_SSL",
        "kafka.sasl.mechanism": "OAUTHBEARER",
        # pass the pre-fetched token straight into the JAAS config
        "kafka.sasl.jaas.config": f'org.apache.kafka.common.security.oauthbearer.OAuthBearerLoginModule required oauth.token="{oauth_token}";',
        "kafka.sasl.login.callback.handler.class": "org.apache.kafka.common.security.oauthbearer.secured.OAuthBearerLoginCallbackHandler"
    }
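
    For context, we mint the oauth_token with the azure-identity package, roughly like this (a sketch; the client ID placeholder is redacted):

    from azure.identity import ManagedIdentityCredential

    # User-assigned managed identity attached to the compute;
    # for a system-assigned identity, omit client_id.
    credential = ManagedIdentityCredential(client_id="<managed-identity-client-id>")
    # Scope targets the Event Hubs namespace used in the Kafka config
    oauth_token = credential.get_token(f"https://{event_hubs_server}/.default").token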

    What we are trying currently:

    1. Using a custom login callback handler instead of the default handler. We are stuck here because the right libraries cannot be found on the cluster.

    Our current code with a client secret:

    sasl_config = f'kafkashaded.org.apache.kafka.common.security.oauthbearer.OAuthBearerLoginModule required clientId="{client_id}" clientSecret="{client_secret}" scope="https://{event_hubs_server}/.default" ssl.protocol="SSL";'
    KAFKA_OPTIONS = {
      "kafka.bootstrap.servers"  : f"{event_hubs_server}:9093",
      "subscribe"                : event_hubs_topic,
      "kafka.sasl.mechanism"     : "OAUTHBEARER",
      "kafka.security.protocol"  : "SASL_SSL",
      "kafka.sasl.jaas.config"   : sasl_config,
      "kafka.sasl.oauthbearer.token.endpoint.url": f"<url>",
      "kafka.sasl.login.callback.handler.class": "kafkashaded.org.apache.kafka.common.security.oauthbearer.secured.OAuthBearerLoginCallbackHandler",
      "kafka.request.timeout.ms" : "60000",
      "kafka.session.timeout.ms" : "60000",
      "maxOffsetsPerTrigger"     : "1000",
      "failOnDataLoss"           : "false",
      "startingOffsets"          : "earliest"
    }
1 REPLY

mark_ott
Databricks Employee

Currently, Databricks does not support using Managed Identities directly for Kafka client authentication (e.g., MSK IAM or Event Hubs Kafka endpoint) in Python Structured Streaming connections. However, there is a supported and secure alternative that aligns with your SFI goal of eliminating client secrets — Unity Catalog service credentials configured with a Managed Identity–based access connector.

Current State of Managed Identity for Kafka in Databricks

Managed Identity–based OAuth authentication for Kafka clients is not yet supported natively in Databricks streaming readers or writers for Kafka on AWS or Azure. As of 2025, Databricks recommends replacing traditional credential-based authentication (client secrets, certificates) with Unity Catalog service credentials that encapsulate a Managed Identity or instance profile for Kafka access.

Recommended Approach Using Unity Catalog Service Credentials

To align with your SFI directive and eliminate client secrets:

  1. Create a Managed Identity and Access Connector

    • In Azure, set up an Azure Databricks access connector bound to a user-assigned managed identity.

    • Grant this managed identity access to your target service (MSK or Event Hubs).

    • Record the access connector’s Resource ID.

  2. Create a Unity Catalog Service Credential

    • In Databricks, create a new service credential linked to that access connector, using Catalog Explorer or the REST API (the access connector itself is created in the Azure portal).

    • The credential references the connector by its Resource ID, for example /subscriptions/<sub-id>/resourceGroups/<rg>/providers/Microsoft.Databricks/accessConnectors/<connector-name>.

    • Optionally, include your user-assigned managed identity ID. A Python SDK sketch follows after step 3.

  3. Reference the Service Credential in Kafka Configuration

    • Replace the old secret-based Kafka auth block in your Spark code with:

      python
      df = (
          spark.readStream
          .format("kafka")
          # the service credential supplies auth via the bound managed identity
          .option("databricks.serviceCredential", "my_kafka_sc")
          .option("kafka.bootstrap.servers", "<bootstrap-server-url>")
          .option("subscribe", "<topic>")
          .load()
      )
    • When the databricks.serviceCredential option is used, you should not include SASL, JAAS, or protocol configuration parameters (kafka.sasl.mechanism, kafka.security.protocol, etc.); Databricks manages those using the bound managed identity.
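
If you prefer to script step 2 rather than use Catalog Explorer, recent versions of the Databricks Python SDK expose a Unity Catalog credentials API. A minimal sketch, assuming your databricks-sdk version ships credentials.create_credential with a SERVICE purpose (treat the exact names as illustrative, not authoritative):

      python
      from databricks.sdk import WorkspaceClient
      from databricks.sdk.service.catalog import AzureManagedIdentity, CredentialPurpose

      # Assumes ambient workspace authentication (e.g., running in a notebook).
      w = WorkspaceClient()

      # Bind the credential to the access connector from step 1; the resource ID
      # placeholder below is the one recorded earlier.
      cred = w.credentials.create_credential(
          name="my_kafka_sc",
          purpose=CredentialPurpose.SERVICE,
          azure_managed_identity=AzureManagedIdentity(
              access_connector_id="/subscriptions/<sub-id>/resourceGroups/<rg>/providers/Microsoft.Databricks/accessConnectors/<connector-name>",
          ),
      )
      print(cred.name)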

Availability and Considerations

  • This feature is available in Databricks Runtime 16.1 and later.

  • Works across AWS MSK and Azure Event Hubs with Managed Identity or Instance Profile.

  • Ideal for serverless or shared compute environments where secret injection is discouraged.

  • For older runtimes or environments without Unity Catalog, the only supported options remain IAM (AWS instance profile) or Entra ID client secret–based OAuth.
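
Putting it together, a minimal end-to-end sketch of the migrated job (assumes Databricks Runtime 16.1+, the my_kafka_sc credential from above, and placeholder table and checkpoint paths). The throughput and offset options from your existing config carry over unchanged; only the SASL/JAAS settings are dropped:

  python
  df = (
      spark.readStream
      .format("kafka")
      .option("databricks.serviceCredential", "my_kafka_sc")  # replaces the SASL/JAAS block
      .option("kafka.bootstrap.servers", f"{event_hubs_server}:9093")
      .option("subscribe", event_hubs_topic)
      .option("maxOffsetsPerTrigger", "1000")    # carried over from the original config
      .option("startingOffsets", "earliest")
      .option("failOnDataLoss", "false")
      .load()
  )

  # Write to a Delta table; table name and checkpoint path are placeholders.
  query = (
      df.writeStream
      .format("delta")
      .option("checkpointLocation", "/Volumes/<catalog>/<schema>/<volume>/checkpoints/kafka")
      .toTable("<catalog>.<schema>.events_raw")
  )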

In Summary

If your environment is on Databricks Runtime 16.1 or higher, use Unity Catalog service credentials connected to an Azure Managed Identity to securely authenticate to Kafka (MSK/Event Hubs) without relying on a client secret. This model satisfies SFI governance by removing embedded secrets and leveraging Azure-managed tokens.