Hello everyone,
I need your help with a topic that has been preoccupying me for a few days.
"from_avro" function gives me a strange result when I pass it the json schema of a Kafka topic.
====================================================================
lastVersion = schema_registry_client.get_latest_version(topic_name +"-value")
print(f"last schema structure is {lastVersion.schema.schema_str}")
The schema looks pretty simple.
====================================================================
schema_json = """{
"namespace": "aaa.kafkaday.planedemo",
"type": "record",
"name": "PlaneDemo",
"fields": [
{"name": "FlightNumber", "type": "string"},
{"name": "PositionTime", "type": "long"},
{"name": "Lat", "type": "double"},
{"name": "Long", "type": "double"},
{"name": "Alt", "type": "double"}
]
}"""
display(df.select(
from_avro( col("value"),
schema_json, from_avro_options)
))
Using from_avro on the same dataset with SchemaRegistry information it works well.
====================================================================
display(df.select(
from_avro(data = F.col("value"),
subject = topic_name +"-value",
schemaRegistryAddress = schema_registry_address,
options = schema_registry_options).alias("value")
).selectExpr("*"))
A suggestion on where I'm going wrong ?
Thank you.