Data Engineering

Databricks as a "pure" data streaming software like Confluent

noorbasha534
Contributor

Hi all,

I was wondering if anyone has used Databricks as a "pure" data streaming platform in place of Confluent, Flink, Kafka, etc.

The reference architectures I have seen mostly place Databricks on the data processing side, after the data has been made available by Confluent, Flink, or Kafka.

I would appreciate it if you could share your insights.

1 REPLY

szymon_dybczak
Contributor III

Hi @noorbasha534 ,

It depends on what you're asking. Kafka is primarily a messaging system, optimized for high-throughput, distributed message logs. Databricks can read from Kafka as a data source, but it doesn't replace Kafka's role in message distribution.
But if you're comparing Kafka Streams (Kafka's own stream processing library) with Apache Spark Structured Streaming (which Databricks uses for stream processing), then yes: I think Databricks' streaming capabilities are top-notch, and you can use it instead of Kafka Streams and be happy with it.
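
To illustrate the "Databricks reads from Kafka" side, here is a minimal sketch of consuming a topic with Structured Streaming's built-in Kafka source. The broker address `broker-1:9092`, the topic name `orders`, and the helper names `kafka_source_options` and `read_kafka_stream` are assumptions made up for this example; only the `kafka` format and its option keys come from Spark itself.

```python
from typing import Dict


def kafka_source_options(bootstrap_servers: str, topic: str,
                         starting_offsets: str = "latest") -> Dict[str, str]:
    """Build the option map for Spark's built-in Kafka source."""
    return {
        "kafka.bootstrap.servers": bootstrap_servers,
        "subscribe": topic,
        "startingOffsets": starting_offsets,
    }


def read_kafka_stream(spark, options: Dict[str, str]):
    """Return a streaming DataFrame; `spark` is an existing SparkSession."""
    raw = spark.readStream.format("kafka").options(**options).load()
    # The Kafka source exposes key and value as binary, so cast before use.
    return raw.selectExpr(
        "CAST(key AS STRING) AS key",
        "CAST(value AS STRING) AS value",
        "timestamp",
    )


# Hypothetical broker address and topic name, for illustration only.
opts = kafka_source_options("broker-1:9092", "orders")
```

In a Databricks notebook, where `spark` is already defined, `read_kafka_stream(spark, opts)` would give you a streaming DataFrame to transform and write out with `writeStream`. Kafka (or a similar log) still does the message distribution here; Spark only does the processing.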

As for Apache Flink, it is known for low-latency, stateful, complex event processing. If your streaming use case involves that kind of complex operation, Flink may be the better choice. But with the intensive development of Spark Structured Streaming, this boundary is blurring.

 
