Receiving Null values from Eventhub streaming.
07-14-2024 11:55 PM
Hi, I am new to PySpark and am facing an issue while consuming data from Azure Event Hubs. I am unable to deserialize the consumed data: after applying the schema, I see only null values. Below are the schema, a sample message, and the code I am using to consume the data. Please let me know how I can resolve this issue. Thanks in advance.
Sample Eventhub message:
EventHubOverrideMessage(gtin13=00******0010, sourceId=0******5, lastUpdateTimestamp=2024-07-09T12:45:00.009805, lastUpdatedUser=null, inStore=ModalityOverride(reason=Other, override=true, startDate=2023-10-23T00:00, endDate=null), pickup=ModalityOverride(reason=Other, override=true, startDate=2023-10-23T00:00, endDate=null), delivery=ModalityOverride(reason=Other, override=true, startDate=2023-10-23T00:00, endDate=null), ship=null)
Deserialization Schema: (I also tried replacing TimestampType with StringType)
Actual Schema:
CODE:
PRINT SCHEMA OUTPUT:
Schema of Kafka DataFrame:
root
 |-- key: binary (nullable = true)
 |-- value: string (nullable = true)
 |-- topic: string (nullable = true)
 |-- partition: integer (nullable = true)
 |-- offset: long (nullable = true)
 |-- timestamp: timestamp (nullable = true)
 |-- timestampType: integer (nullable = true)
Deserialized Output:
{"gtin13":null,"sourceId":null,"lastUpdateTimestamp":null,"inStore":null,"pickup":null,"delivery":null,"ship":null}
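One thing that may be worth checking, offered as an assumption rather than a confirmed diagnosis: the sample message above looks like the toString() output of a Java object rather than JSON, and Spark's from_json produces nulls when it cannot parse the input string as JSON. The sketch below uses only plain Python (no Spark) to test whether a payload is valid JSON. The helper name is_valid_json and the JSON equivalent of the message are hypothetical, and the sample message is abbreviated.

```python
import json


def is_valid_json(payload: str) -> bool:
    """Return True if the payload parses as JSON, False otherwise."""
    try:
        json.loads(payload)
        return True
    except json.JSONDecodeError:
        return False


# Abbreviated form of the sample message from the post (toString()-style text).
to_string_payload = (
    "EventHubOverrideMessage(gtin13=00******0010, sourceId=0******5, ship=null)"
)

# Hypothetical JSON equivalent, assuming the producer serialized the object as JSON.
json_payload = '{"gtin13": "00******0010", "sourceId": "0******5", "ship": null}'

print(is_valid_json(to_string_payload))  # False
print(is_valid_json(json_payload))       # True
```

If the real payloads fail this check, the likely fix is on the producer side (serializing the object as JSON before sending it) rather than in the schema.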

