I have a Databricks notebook in which I am streaming from a Pub/Sub topic. The code looks like this:
%pip install --upgrade google-cloud-pubsub[pandas]
from pyspark.sql import SparkSession
authOptions = {"clientId": "123", "clientEmail": "123@project-id.iam.gserviceaccount.com", "privateKey": "-----BEGIN PRIVATE KEY-----1234-----END PRIVATE KEY-----\n", "privateKeyId": "1234"}
stream = (spark.readStream.format("pubsub")
    .option("subscriptionId", "firstfuel-reporting-test-subscription")
    .option("topicId", "firstfuel-reporting-test")
    .option("projectId", "project-id")
    .options(**authOptions).load())
decodedStream = stream.withColumn("decodedData", stream["payload"].cast("string"))
result = decodedStream.writeStream.outputMode("append").format("console").start()
When I run this, the stream starts successfully and any messages published to the Pub/Sub topic are acknowledged right away. However, I cannot see the actual payload printed on the console. How can I do that? And if I need to use the received messages for some other purpose, how can I do that? Below is a view of what I see after the stream starts: