Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Ingesting Data from Event Hubs via Kafka API with Serverless Compute

1GauravS
New Contributor III

Hi!

I'm currently working on ingesting log data from Azure Event Hubs into Databricks. Initially, I was using a managed Databricks workspace, which couldn't access Event Hubs over a private endpoint. To resolve this, our DevOps team provisioned a VNet-injected workspace within the same virtual network as Event Hubs. This allowed successful ingestion, but only when using classic compute. Unfortunately, serverless compute still doesn't support this private endpoint setup.

Has anyone found a workaround for using serverless compute with Event Hubs over private endpoints? Or is classic compute the only viable option in this scenario?



Below is the DLT pipeline code that I am using:

import dlt
from pyspark.sql.functions import col, from_json

@dlt.table(
  name="poc_event_hub_process",
  comment="Raw ingestion from Event Hubs/Kafka into bronze layer",
  table_properties={"quality": "bronze", "delta.enableChangeDataFeed": "true"}
)
def poc_event_hub_process():
    # bootstrap_servers, event_hub, sasl_username, sasl_password and
    # json_schema are defined earlier in the pipeline source.
    df = (spark.readStream
            .format("kafka")
            .option("kafka.bootstrap.servers", bootstrap_servers)
            .option("subscribe", event_hub)
            .option("kafka.security.protocol", "SASL_SSL")
            .option("kafka.sasl.mechanism", "PLAIN")
            .option("kafka.sasl.jaas.config",
                    f'kafkashaded.org.apache.kafka.common.security.plain.PlainLoginModule required username="{sasl_username}" password="{sasl_password}";')
            .option("failOnDataLoss", "false")
            .option("startingOffsets", "earliest")
            .load())
    # Cast the raw Kafka value bytes to string, then parse the JSON payload.
    df = df.withColumn("value", col("value").cast("string"))
    return df.withColumn("value", from_json(col("value"), json_schema))
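
For reference, here is roughly how the variables used above are defined; the namespace, secret scope/key, and schema are placeholders (the "$ConnectionString" username is how Event Hubs' Kafka-compatible endpoint expects SASL PLAIN credentials):

from pyspark.sql.types import StringType, StructField, StructType, TimestampType

# Event Hubs exposes its Kafka-compatible endpoint on port 9093.
bootstrap_servers = "<namespace>.servicebus.windows.net:9093"  # placeholder namespace
event_hub = "<event-hub-name>"  # the event hub is addressed as the Kafka topic

# For SASL PLAIN against Event Hubs, the username is the literal string
# "$ConnectionString" and the password is the connection string itself.
sasl_username = "$ConnectionString"
sasl_password = dbutils.secrets.get(scope="<scope>", key="<eh-conn-string>")  # placeholder scope/key

# Placeholder schema for the JSON payload carried in the Kafka value column.
json_schema = StructType([
    StructField("timestamp", TimestampType(), True),
    StructField("message", StringType(), True),
])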

This is the error that I get when I change the compute to Serverless in the DLT pipeline settings:

terminated with exception: kafkashaded.org.apache.kafka.common.errors.TimeoutException: Timed out waiting for a node assignment. Call: describeTopics


It seems like the connection is not being established. Is there anything that can be done to resolve this?

I'd appreciate any insights or experiences from others who've tackled similar setups, especially if you've managed to get serverless compute to ingest from EH. Thanks in advance!


2 REPLIES

mark_ott
Databricks Employee

Serverless compute in Azure Databricks does not support accessing resources over private endpoints, such as an Azure Event Hubs namespace configured with a private endpoint. This is a known and frequently cited limitation in the Databricks documentation and community forums as of late 2025. Your experience, where classic (VNet-injected) compute works but serverless does not, is a direct result of this difference in networking architecture.

Why Serverless Fails with Private Endpoints

  • Serverless compute clusters are deployed in a Databricks-managed VNet, not in your own customer VNet.

  • These managed VNets cannot 'see' private endpoints deployed within your customer VNet, so connection attempts from serverless compute to resources exposed only via private endpoint fail.

  • This is why you see the "TimeoutException: Timed out waiting for a node assignment. Call: describeTopics" error: the traffic never reaches your Event Hub. A quick way to confirm this is a plain TCP reachability check, sketched after this list.
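
A minimal sketch of such a reachability check, runnable from a notebook on either compute type (the namespace name is a placeholder; 9093 is the Event Hubs Kafka endpoint port):

import socket

host, port = "<namespace>.servicebus.windows.net", 9093  # placeholder namespace

try:
    # Attempt a raw TCP connection; this succeeds only if the network path
    # from the current compute to the Event Hubs endpoint is routable.
    with socket.create_connection((host, port), timeout=10):
        print(f"TCP connection to {host}:{port} succeeded")
except OSError as e:
    print(f"TCP connection to {host}:{port} failed: {e}")

If this fails on serverless but succeeds on classic compute, the problem is network routability rather than your Kafka options.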

Current Viable Workarounds

  • Classic compute (VNet-injected) is currently the only viable supported method for accessing private resources like Event Hubs with private endpoints from Databricks.

  • No official Azure-supported or community-documented workaround exists for making serverless compute work with private endpoints for Event Hubs.

  • If you must use serverless compute, your only option is to expose Event Hubs through public endpoints, which usually violates security/compliance requirements in enterprise environments.

Additional Considerations

  • There have been requests for this serverless/private endpoint integration, but as of late 2025, it remains unsupported.

  • Some have explored custom network appliances or firewall rules, but none reliably resolve the lack of direct network routability between serverless compute and customer VNets hosting the private endpoint.

  • You may consider using classic compute with autoscaling as a "serverless-like" cost-control mechanism until full support is available (a settings sketch follows this list).
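
For illustration, autoscaling is configured in the "clusters" section of the DLT pipeline's settings; a minimal sketch of that section, shown here as a Python dict with placeholder worker counts:

# Sketch of the "clusters" section of a DLT pipeline's settings.
pipeline_cluster_settings = {
    "clusters": [
        {
            "label": "default",
            "autoscale": {
                "min_workers": 1,    # scale down when the stream is idle
                "max_workers": 4,    # cap cost during bursts
                "mode": "ENHANCED",  # DLT enhanced autoscaling
            },
        }
    ]
}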

Recommended Actions

  • Continue using classic (VNet-injected) compute for workloads requiring access to Event Hubs via private endpoints.

  • Periodically check Databricks and Azure roadmaps/documentation for updates on private endpoint support for serverless compute, as this is a highly requested feature.

  • Re-architecting for serverless compute is not recommended if private endpoint access is mandatory.


If you have further questions, or your use case requires a more detailed workaround design, sharing specifics may help community experts suggest alternative architectures.


Summary Table

Compute Type              Private Endpoint Access   Notes
Classic (VNet-injected)   Yes                       Fully supported; recommended for this use case
Serverless                No                        Not supported as of Oct 2025

There is unfortunately no workaround at this time: classic compute remains the only reliable way to use Event Hubs over private endpoints with Databricks.

1GauravS
New Contributor III

Hi @mark_ott , thanks for your response.

I followed the documentation below to configure private connectivity to Azure resources and was able to ingest logs using serverless compute. Having an NCC (network connectivity configuration) set up is the key here.

https://learn.microsoft.com/en-us/azure/databricks/security/network/serverless-network-security/serv...
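
For anyone following the same doc, the NCC setup boils down to three account-level steps: create the NCC, add a private endpoint rule for the Event Hubs namespace, and bind the NCC to the workspace (then approve the pending connection on the namespace in the Azure portal). A rough sketch against the Databricks account REST API; the endpoint paths, payload fields, and all IDs below are assumptions based on that doc, so verify them against the current docs before use:

import requests

# All values are placeholders; paths and field names are assumptions.
ACCOUNT_HOST = "https://accounts.azuredatabricks.net"
ACCOUNT_ID = "<databricks-account-id>"
HEADERS = {"Authorization": "Bearer <account-admin-token>"}

# 1) Create a network connectivity configuration (NCC) in the workspace region.
ncc = requests.post(
    f"{ACCOUNT_HOST}/api/2.0/accounts/{ACCOUNT_ID}/network-connectivity-configs",
    headers=HEADERS,
    json={"name": "serverless-eventhubs-ncc", "region": "<azure-region>"},
).json()
ncc_id = ncc["network_connectivity_config_id"]  # assumed response field

# 2) Add a private endpoint rule for the Event Hubs namespace
#    ("namespace" is the Azure private link group id for Event Hubs).
requests.post(
    f"{ACCOUNT_HOST}/api/2.0/accounts/{ACCOUNT_ID}"
    f"/network-connectivity-configs/{ncc_id}/private-endpoint-rules",
    headers=HEADERS,
    json={
        "resource_id": "/subscriptions/<sub>/resourceGroups/<rg>"
                       "/providers/Microsoft.EventHub/namespaces/<namespace>",
        "group_id": "namespace",
    },
)

# 3) Bind the NCC to the workspace, then approve the pending private endpoint
#    connection on the Event Hubs namespace in the Azure portal.
requests.patch(
    f"{ACCOUNT_HOST}/api/2.0/accounts/{ACCOUNT_ID}/workspaces/<workspace-id>",
    headers=HEADERS,
    json={"network_connectivity_config_id": ncc_id},
)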