I am running Kafka producer code on Databricks Runtime 12.2. I am testing Avro serialization of messages with the help of the Confluent Schema Registry. I configured the `to_avro` function to read the schema from the schema registry, but I am getting the error below:
> org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 99.0 failed 4 times, most recent failure: Lost task 0.3 in stage 99.0 (TID 125) (10.52.30.102 executor 0): org.spark_project.confluent.kafka.schemaregistry.client.rest.exceptions.RestClientException: Schema not found; error code: 40403
```python
from pyspark.sql import functions as F
from pyspark.sql.avro.functions import to_avro

df_tpch_orders.withColumn("value", to_avro(
    data=df_tpch_orders.o_comment,
    options=schema_registry_options,
    schemaRegistryAddress=schema_registry_address,
    subject=F.lit("demo-topic-value"),
))
```
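For completeness, `schema_registry_address` and `schema_registry_options` are set up roughly like this (a sketch with the URL and credentials replaced by placeholders; the option keys are the ones I understand the Databricks schema registry integration to use):

```python
# Sketch of the registry config referenced above; URL and credentials are placeholders.
schema_registry_address = "https://my-registry.example.com"
schema_registry_options = {
    "confluent.schema.registry.basic.auth.credentials.source": "USER_INFO",
    "confluent.schema.registry.basic.auth.user.info": "<api-key>:<api-secret>",
}
```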
The code works when I pass the Avro schema directly instead of reading it from the schema registry. I also confirmed that the schema is present under the correct subject name.
I found this, which says the error might occur because the data is incompatible with the provided schema, but that is not true in my case: the code works when I hardcode the schema (or read it manually using the Confluent API).
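For reference, the variant that does work looks roughly like this (a sketch assuming the `confluent_kafka` package is installed on the cluster and the registry needs no extra auth):

```python
from confluent_kafka.schema_registry import SchemaRegistryClient
from pyspark.sql.avro.functions import to_avro

# Fetch the latest schema for the subject manually and pass it as a JSON string,
# so to_avro does not do the registry lookup itself.
sr_client = SchemaRegistryClient({"url": schema_registry_address})
schema_str = sr_client.get_latest_version("demo-topic-value").schema.schema_str

df_out = df_tpch_orders.withColumn(
    "value",
    to_avro(df_tpch_orders.o_comment, schema_str),  # schema passed explicitly
)
```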