Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Exploring the Use of Databricks as a Transactional Database

amoralca
New Contributor

Hey everyone, I'm currently working on a project where my team is thinking about using Databricks as a transactional database for our backend application. We're familiar with Databricks for analytics and big data processing, but we're not sure if it's the right fit for handling real-time transactional workloads. Has anyone in the community successfully used Databricks for this purpose? Is it a good idea, or would it be better to stick with traditional transactional databases? If you have any experience, success stories, or advice, I'd really appreciate hearing about it. Looking forward to your insights! Best,

3 REPLIES

Slash
Contributor

Hi @amoralca ,

Databricks is mainly used for big data processing. In my opinion it's not the best choice for an OLTP database. You spin up all those cluster nodes, but your workload is transactional in nature, so you're wasting all that compute power.

Additionally, a lakehouse depends heavily on 'big data file formats' like Parquet, Delta Lake, ORC, Iceberg, etc. The data files in these formats are typically immutable. In an OLTP system you have to do a lot of small synchronous updates, which is cumbersome in a lakehouse.
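To make the point about immutable files concrete, here is a toy sketch in plain Python. It is not Databricks or Delta Lake code: `DataFile` and `CopyOnWriteTable` are hypothetical names, and it assumes a simple copy-on-write scheme where changing one row means writing a new copy of the whole file that holds it. Real engines have optimizations that this ignores, so treat it only as an illustration of why many small updates get expensive.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataFile:
    """Hypothetical stand-in for one immutable data file (e.g. a Parquet file)."""
    rows: tuple  # tuple of (key, value) pairs; never modified after creation


class CopyOnWriteTable:
    """Toy table made of immutable files; an update replaces a whole file."""

    def __init__(self, rows, rows_per_file):
        # Split the rows into fixed-size immutable files.
        self.files = [
            DataFile(tuple(rows[i:i + rows_per_file]))
            for i in range(0, len(rows), rows_per_file)
        ]
        self.rows_rewritten = 0  # total rows written out again by updates

    def update(self, key, new_value):
        for idx, data_file in enumerate(self.files):
            if any(k == key for k, _ in data_file.rows):
                # The file cannot be edited in place, so build a full new copy.
                new_rows = tuple(
                    (k, new_value if k == key else v) for k, v in data_file.rows
                )
                self.files[idx] = DataFile(new_rows)
                self.rows_rewritten += len(new_rows)
                return
        raise KeyError(key)

    def get(self, key):
        for data_file in self.files:
            for k, v in data_file.rows:
                if k == key:
                    return v
        raise KeyError(key)


# 10,000 rows spread over 10 files of 1,000 rows each.
table = CopyOnWriteTable([(i, 0) for i in range(10_000)], rows_per_file=1_000)

# Three single-row updates, each landing in a different file.
for key in (5, 1_005, 2_005):
    table.update(key, 1)

print(table.rows_rewritten)  # 3000 rows rewritten to change 3 values
```

A row store in a typical OLTP database would touch roughly one row (plus index and log entries) per update, which is the gap being described here.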


But this is an interesting question and I'd like to hear more voices on this topic.

Kaniz_Fatma
Community Manager

Hi @amoralca, Thanks for reaching out! Please review the responses and let us know which best addresses your question. Your feedback is valuable to us and the community. If the response resolves your issue, kindly mark it as the accepted solution. This will help close the thread and assist others with similar queries. We appreciate your participation and are here if you need further assistance!

Edthehead
Contributor

My 2 cents: a Databricks Lakehouse is like a DWH, similar to an Azure Synapse dedicated pool, and is meant for a certain purpose. With all that power comes a limitation in concurrency and in the number of queries that can run in parallel. So it's great if you are loading large amounts of data into it or running analytical queries. But if you are going to have hundreds to thousands of small queries and inserts, I do not see it as a good fit. Those queries and single-row inserts will not be taking advantage of Spark at all. Normal SQL DBs come with comparatively lower storage limits but have good concurrency for small queries and inserts. Technically, though, you can still use a Databricks lakehouse as an OLTP DB.
