Databricks + confluent schema registry - Schema not found error

ravitejasutrave
New Contributor

I am running Kafka producer code on Databricks Runtime 12.2. I am testing Avro serialization of messages with the help of the Confluent Schema Registry. I configured the 'to_avro' function to read the schema from the schema registry, but I am getting the error below:

> org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 99.0 failed 4 times, most recent failure: Lost task 0.3 in stage 99.0 (TID 125) (10.52.30.102 executor 0): org.spark_project.confluent.kafka.schemaregistry.client.rest.exceptions.RestClientException: Schema not found; error code: 40403

import pyspark.sql.functions as F
from pyspark.sql.avro.functions import to_avro

df_tpch_orders.withColumn("value", to_avro(
    data=df_tpch_orders.o_comment,
    options=schema_registry_options,
    schemaRegistryAddress=schema_registry_address,
    subject=F.lit("demo-topic-value"),
))

The code works when I pass the Avro schema directly instead of reading it from the schema registry. I also confirmed that the schema is present under the correct subject name.

I found this, which says it might be because the data is incompatible with the registered schema, but that is not true in my case: serialization works when I hardcode the schema (or read it manually using the Confluent API).
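
For reference, the working variant looks roughly like this (a minimal sketch; it assumes the confluent-kafka Python package is installed for the manual schema fetch):

from confluent_kafka.schema_registry import SchemaRegistryClient
from pyspark.sql.avro.functions import to_avro

# Fetching the schema manually succeeds, which is how I confirmed the
# subject exists in the registry.
sr_client = SchemaRegistryClient({"url": schema_registry_address})
schema_str = sr_client.get_latest_version("demo-topic-value").schema.schema_str

# Passing the schema string directly to to_avro works without error.
df_tpch_orders.withColumn(
    "value", to_avro(df_tpch_orders.o_comment, schema_str)
)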

1 REPLY

Kaniz_Fatma
Community Manager

Hi @ravitejasutrave,

  • Ensure that the schema is compatible with the data you’re trying to serialize.
  • Double-check your configuration for connecting to the schema registry. Make sure that schemaRegistryAddress points to the correct URL where your schema registry is running.
  • Confirm that the subject name ("demo-topic-value") matches the subject under which the schema is registered in the schema registry; a quick way to check both of these is sketched after this list.
  • There have been improvements and bug fixes related to schema registry integration in newer versions of Kafka; consider upgrading your Kafka client to 2.8.0 or above, which might resolve the issue.
  • When configuring your Structured Streaming writer, try setting the option "kafka.enable.idempotence" to "false"; this can sometimes help avoid issues related to the schema registry (see the second sketch below).
  • Ensure that you have the correct dependencies (including the Confluent Schema Registry client) in your Databricks environment.
  • Verify that the versions of the Kafka client, Avro, and the schema registry client are compatible.
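
A minimal sketch for the second and third points, checking the registry URL and the subject directly over the registry's REST API (the URL below is a placeholder for your schema_registry_address):

import requests

schema_registry_address = "http://schema-registry:8081"  # placeholder

# List every registered subject; "demo-topic-value" should appear here.
print(requests.get(f"{schema_registry_address}/subjects").json())

# Fetch the latest schema registered under the subject; a 404 response
# means the subject or schema is missing from this registry instance.
resp = requests.get(
    f"{schema_registry_address}/subjects/demo-topic-value/versions/latest"
)
print(resp.status_code, resp.json())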
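
And a minimal sketch of a Structured Streaming writer with idempotence disabled (broker address, topic, and checkpoint path are placeholders; df_serialized stands for your DataFrame with the Avro-encoded value column):

(df_serialized
    .writeStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")       # placeholder
    # Options prefixed with "kafka." are passed through to the Kafka producer.
    .option("kafka.enable.idempotence", "false")
    .option("topic", "demo-topic")                          # placeholder
    .option("checkpointLocation", "/tmp/checkpoints/demo")  # placeholder
    .start())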

Let me know if this doesn't help.
