Data Engineering

Unable to stream from google pub/sub

210573
New Contributor

I am trying to run the code below to subscribe to a Pub/Sub subscription, but it throws this exception:

java.lang.NoClassDefFoundError: org/apache/spark/sql/sources/v2/DataSourceV2

I have tried every version of the connector at https://mvnrepository.com/artifact/com.google.cloud/pubsublite-spark-sql-streaming but have had no luck so far.

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName('Simple Pub/Sub Lite Read').getOrCreate()

df = spark.readStream \
    .format("pubsublite") \
    .option("pubsublite.subscription", "My subscription path") \
    .option("gcp.credentials.key", "my gcp credential").load()

df.show(10, False)

3 REPLIES

-werners-
Esteemed Contributor III

Can you retry without creating a SparkSession? Databricks provides one for you.
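
A minimal sketch of that suggestion, assuming a Databricks notebook where `spark` is already defined. The helper names `read_pubsublite` and `start_preview` and the table name `pubsub_preview` are hypothetical; the format name and option keys are copied from the original post. Note that `show()` cannot be called on a streaming DataFrame, so the sketch writes to an in-memory table for inspection instead.

```python
def read_pubsublite(spark, subscription_path, credentials_key):
    """Build a streaming DataFrame from the session the notebook already provides.

    No SparkSession.builder call is made here; `spark` is passed in as-is.
    """
    return (
        spark.readStream
        .format("pubsublite")
        .option("pubsublite.subscription", subscription_path)
        .option("gcp.credentials.key", credentials_key)
        .load()
    )


def start_preview(df, table_name="pubsub_preview"):
    """Start a streaming query that appends rows to an in-memory table.

    Streaming DataFrames must be started with writeStream; df.show() raises
    an AnalysisException on them.
    """
    return (
        df.writeStream
        .format("memory")
        .queryName(table_name)
        .outputMode("append")
        .start()
    )


# Usage in a notebook (placeholders taken from the question):
# df = read_pubsublite(spark, "My subscription path", "my gcp credential")
# query = start_preview(df)
# spark.sql("SELECT * FROM pubsub_preview").show(10, False)
```

This alone may not clear the `NoClassDefFoundError`: `org.apache.spark.sql.sources.v2.DataSourceV2` belongs to the Spark 2.4 data source API and was removed in Spark 3.0, so the error suggests the attached connector jar was built for a different Spark version than the cluster runs.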

Noopur_Nigam
Databricks Employee

Hi @cloud user, as of now we do not have Structured Streaming support for Pub/Sub. The sources currently supported by Structured Streaming are listed here:

https://docs.gcp.databricks.com/spark/latest/structured-streaming/data-sources.html

Ajay-Pandey
Esteemed Contributor III

Hi @210573 

Databricks now supports Pub/Sub streaming natively, so you can use it for your use case. For more information, see the official page below:

PUB/SUB with Databricks 
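
As a rough sketch of what a read with the native connector might look like: the `pubsub` format name and the option keys below (`subscriptionId`, `topicId`, `projectId`, and the four service-account fields) are written from memory of the Databricks documentation and should be checked against the linked page. The helper names and all argument values are hypothetical, and `spark` is assumed to be the session the notebook provides.

```python
def build_auth_options(client_id, client_email, private_key, private_key_id):
    """Collect service-account fields into the option dict passed to the reader.

    Key names are assumed from the Databricks Pub/Sub connector docs.
    In practice these values would come from a secret store, not literals.
    """
    return {
        "clientId": client_id,
        "clientEmail": client_email,
        "privateKey": private_key,
        "privateKeyId": private_key_id,
    }


def read_pubsub(spark, project_id, topic_id, subscription_id, auth_options):
    """Build a streaming DataFrame using the native `pubsub` source."""
    return (
        spark.readStream
        .format("pubsub")
        .option("subscriptionId", subscription_id)
        .option("topicId", topic_id)
        .option("projectId", project_id)
        .options(**auth_options)
        .load()
    )


# Usage in a notebook (all values are placeholders):
# auth = build_auth_options("<client-id>", "<client-email>", "<private-key>", "<private-key-id>")
# df = read_pubsub(spark, "<project>", "<topic>", "<subscription>", auth)
```
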

Ajay Kumar Pandey
