Hi all - Matt Jones here. I'm on the Data Streaming team at Databricks, and I wanted to share a few takeaways from last week's Current 2022 data streaming event (formerly Kafka Summit) in Austin.
By far the most common question we got at the booth was how/why customers would use Kafka/Confluent and Databricks together. A popular use case is to aggregate streaming events through a Kafka-based collector system, then send that event stream into a Databricks streaming pipeline (or roll your own with Spark Structured Streaming, if you prefer). Frank Munz’s blog post on this topic is an excellent overview.
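For those curious what that pattern looks like in practice, here's a minimal PySpark sketch of the Kafka-to-Databricks flow: read from a Kafka topic with Structured Streaming and write the stream out to a Delta table. The broker address, topic name, and paths are placeholders you'd replace with your own; this is a skeleton of the pattern, not a production pipeline (no schema parsing, auth, or error handling).

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("kafka-ingest").getOrCreate()

# Subscribe to a Kafka topic as a streaming source.
# "broker:9092" and "events" are placeholder values.
events = (
    spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")
    .option("subscribe", "events")
    .load()
)

# Kafka delivers keys/values as binary; cast the payload to a string
# (you'd typically apply from_json with a schema here).
parsed = events.selectExpr("CAST(value AS STRING) AS raw_json")

# Sink the stream to a Delta table, with a checkpoint location so the
# query can recover exactly-once state after a restart.
query = (
    parsed.writeStream
    .format("delta")
    .option("checkpointLocation", "/tmp/checkpoints/events")
    .start("/tmp/tables/events")
)
```

On Databricks you'd more likely express this as a Delta Live Tables pipeline, but the underlying source/sink mechanics are the same.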
In addition to the sessions we had at the event, our head of streaming Karthik Ramasamy hosted a meetup that delved into the details of Project Lightspeed, our next-gen Structured Streaming work. The meetup format is a great way to get into more conversational depth than a breakout session affords - for example, one of Karthik's former students from UC Berkeley dug into the details of how we handle asynchronous state checkpointing for low-latency pipelines.
I also had some productive conversations about what Databricks users want from streaming - low latency is obviously desirable, but it has to be balanced against cost and accuracy (given windowing considerations, late-arriving data, etc.). Then of course there are scale/throughput considerations. I'd love to hear how your organizations/teams approach this tradeoff.
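To make the latency/accuracy tradeoff concrete, here's a toy Python sketch of the watermark idea - not Spark's actual implementation, just an illustration. A shorter watermark delay lets you finalize windows sooner (lower latency, less state), but any event arriving later than the watermark gets dropped, costing accuracy. All names here are my own for illustration; in Structured Streaming the equivalent knob is `withWatermark()`.

```python
from collections import defaultdict


def windowed_counts(events, window_size, watermark_delay):
    """Tumbling-window event counts with a simple watermark.

    events: iterable of (event_time, key) pairs in arrival order.
    Events older than (max event time seen) - watermark_delay are
    dropped as "too late" - trading accuracy for bounded state.
    """
    counts = defaultdict(int)   # window start -> event count
    max_seen = float("-inf")    # high-water mark of event time
    dropped = 0

    for event_time, _key in events:
        max_seen = max(max_seen, event_time)
        if event_time < max_seen - watermark_delay:
            dropped += 1        # arrived past the watermark; discard
            continue
        window_start = (event_time // window_size) * window_size
        counts[window_start] += 1

    return dict(counts), dropped


# Two events arrive late relative to the stream's progress; with a
# watermark delay of 5 time units, both are dropped.
counts, dropped = windowed_counts(
    [(1, "a"), (12, "b"), (3, "c"), (25, "d"), (2, "e")],
    window_size=10,
    watermark_delay=5,
)
# counts == {0: 1, 10: 1, 20: 1}; dropped == 2
```

Widen `watermark_delay` and those late events get counted - at the cost of holding window state open longer. That's the dial teams are really arguing about when they argue about "latency."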
The ubiquity of streaming use cases was my big takeaway from Current 2022. Performant streaming architecture isn't a cutting-edge set of use cases reserved for high tech; it's becoming a democratized practice for everyone from grocery stores to the public sector.
If you were at Current, what was the most impactful/interesting thing you got from the event? If you weren’t able to join us this year, please do add your voice - what’s on your data streaming wish list for the next year?