Using SQL for Structured Streaming

chloeh
New Contributor II

Hi!

I'm new to Databricks. I'm trying to create a data pipeline with Structured Streaming. A minimal example pipeline would look like this: read from an upstream Kafka source, do some data transformation, then write to a downstream Kafka sink. I want to do as much of this in SQL as possible, but I'm encountering some issues.
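Here's the shape of what I have in mind, keeping the transformation itself in SQL through a temp view (just a sketch; broker addresses, topic names, and the checkpoint path are placeholders):

```python
# Sketch of the pipeline, assuming a PySpark notebook on a cluster with
# Kafka connectivity. Brokers, topics, and checkpoint path are placeholders.

# Source: Kafka is set up through the DataFrame reader, since I don't
# think plain SQL can declare a streaming source.
events = (
    spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "broker1:9092")
    .option("subscribe", "events_in")
    .load()
)

# Transformation: expose the stream as a temp view so the logic itself
# can live in SQL.
events.createOrReplaceTempView("events_raw")
parsed = spark.sql("""
    SELECT CAST(key AS STRING)          AS key,
           UPPER(CAST(value AS STRING)) AS value
    FROM events_raw
""")

# Sink: back to the DataFrame API, since SQL alone can't define a Kafka sink.
query = (
    parsed.writeStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "broker1:9092")
    .option("topic", "events_out")
    .option("checkpointLocation", "/tmp/checkpoints/events_pipeline")
    .start()
)
```

As far as I can tell, the source and sink still have to go through the DataFrame API, which is what prompted my questions below: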

1. My understanding is that creating streaming sources and sinks via raw SQL is not supported in open-source Spark. Is that true?

2. I found a new `read_kafka` table-valued function in Databricks SQL, but I can't use it in the Community Edition. It gives me ```could not resolve `read_kafka` to a table-valued function```. Is creating sources and sinks with raw SQL only available in the enterprise version of Databricks SQL, i.e., not supported in Spark SQL or the Community Edition? (A sketch of what I tried is below, after this list.)

3. Is the WATERMARK clause only supported in Databricks SQL, not Spark SQL?

4. In general, is there a difference in support between Databricks SQL in the Community Edition vs. the enterprise edition?
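For reference, this is roughly what I tried for questions 2 and 3, based on my reading of the Databricks docs (a sketch; broker and topic names are placeholders, and the syntax may be off):

```python
# Sketch for questions 2-3. `read_kafka` is a Databricks table-valued
# function, not part of open-source Spark SQL, so this assumes a recent
# Databricks Runtime. Broker and topic names are placeholders.
stream_df = spark.sql("""
    SELECT CAST(value AS STRING) AS payload, timestamp
    FROM STREAM read_kafka(
        bootstrapServers => 'broker1:9092',
        subscribe => 'events_in'
    )
""")

# Open-source Spark SQL has no WATERMARK clause; the DataFrame API
# equivalent is withWatermark on the event-time column.
with_watermark = stream_df.withWatermark("timestamp", "10 minutes")
```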

Thank you in advance!

1 REPLY

chloeh
New Contributor II

OK, I figured out why I was getting an error on the usage of `read_kafka`: my default cluster was set up with the wrong Databricks Runtime version.
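In case it helps anyone else: you can check which runtime a notebook is attached to before debugging further (a quick sketch; I believe Databricks sets this environment variable inside notebooks):

```python
import os

# Databricks Runtime version as reported inside the notebook environment
print(os.environ.get("DATABRICKS_RUNTIME_VERSION"))

# Underlying Apache Spark version
print(spark.version)
```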
