Hi @AtanuC, Object-Oriented Programming (OOP) is not typically the best choice for distributed data processing tasks like those handled by PySpark on the Databricks platform.
The main reason is that OOP is built around "objects" that maintain state. In a distributed system, maintaining and synchronizing that state across many nodes is challenging and introduces complexity and potential for error.
Here are a few challenges that can arise when using OOP for distributed data processing:
- State Management: As mentioned, managing and synchronizing state across many nodes can be complex and error-prone.
- Serialization: Objects often need to be serialized and deserialized for transmission across nodes, which can be inefficient and lead to errors if not done carefully.
- Concurrency: OOP doesn't inherently handle concurrent processing well, which is essential in distributed data processing.
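To make the serialization point concrete, here is a small plain-Python sketch (the `StatefulCounter` class and `double` function are illustrative names, not from any library): a pure function pickles cleanly, while an object holding non-picklable state, such as a lock, cannot be shipped to worker nodes at all.

```python
import pickle
import threading

class StatefulCounter:
    """A stateful object holding a lock -- a common OOP pattern."""
    def __init__(self):
        self.count = 0
        self.lock = threading.Lock()  # locks cannot be pickled

def double(x):
    """A pure, stateless function -- trivially serializable."""
    return x * 2

# The pure function serializes without trouble...
payload = pickle.dumps(double)

# ...but the stateful object cannot be sent across the wire:
try:
    pickle.dumps(StatefulCounter())
except TypeError as e:
    print(f"serialization failed: {e}")
```

Spark faces exactly this issue when it ships the functions you pass to transformations out to executors, which is one reason closures over mutable objects are discouraged.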
Functional programming is often a better choice for distributed data processing. This is because functional programming avoids shared state and mutable data, which makes it easier to split a task into independent subtasks that can be processed in parallel. This is a critical advantage in a distributed system where jobs are spread across many nodes.
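The parallelism argument can be sketched without Spark at all (the `tokenize` function and sample lines below are hypothetical): because a pure function's output depends only on its input, a pool of workers can process the inputs independently and in any order.

```python
from concurrent.futures import ThreadPoolExecutor

def tokenize(line):
    """Pure function: the result depends only on the input line."""
    return line.lower().split()

lines = ["Spark splits WORK", "across many NODES", "Spark scales"]

# Because tokenize carries no state, each line is an independent
# subtask; the pool can run them concurrently with no coordination.
with ThreadPoolExecutor(max_workers=3) as pool:
    tokens = list(pool.map(tokenize, lines))

print(tokens)
```

Swap the thread pool for a cluster of executors and this is, conceptually, how Spark distributes a `map` over partitions.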
In the context of PySpark and Databricks, the functional programming paradigm is used extensively.
For example, transformations on RDDs (Resilient Distributed Datasets) in Spark are functional: operations like `map` and `filter` take a function as an argument and return a new, immutable RDD rather than modifying data in place.
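The RDD style can be sketched with plain Python built-ins so it runs without a cluster; the commented line shows the equivalent PySpark chain (assuming a live `SparkContext` named `sc`).

```python
from functools import reduce

# In PySpark this pipeline would read:
#   sc.parallelize(range(1, 6)) \
#     .map(lambda x: x * x) \
#     .filter(lambda x: x % 2 == 0) \
#     .reduce(lambda a, b: a + b)
numbers = range(1, 6)                           # stand-in for an RDD
squared = map(lambda x: x * x, numbers)         # transformation: map
evens = filter(lambda x: x % 2 == 0, squared)   # transformation: filter
total = reduce(lambda a, b: a + b, evens)       # action: reduce

print(total)  # squares are 1, 4, 9, 16, 25; the even ones sum to 20
```

Each step is a stateless function applied to the output of the previous one, which is exactly what lets Spark distribute and recompute partitions freely.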
Remember, the choice of programming paradigm can depend on various factors, including the task's specific requirements, the team's skills and experience, and the tools and frameworks being used.