Databricks Community

aladda · ‎06-19-2021

Spark's execution engine is designed to be lazy. In effect, you first build up your analytics or data processing request as a series of Transformations, and nothing is executed until you call an Action.

Transformations are operations that convert your RDD data from one form to another. When you apply a transformation to an RDD, you get back a new RDD describing the transformed data; the original RDD is left unchanged. Operations such as map and filter are transformations.

Transformations create RDDs from each other, but when we want to work with the actual dataset, an action is performed. Unlike a transformation, an action does not produce a new RDD: it triggers the computation and returns a result to the driver (or writes it out to storage). Example: count on a DataFrame.
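To make the distinction concrete, here is a minimal sketch in plain Python. It is only a toy model of the idea, not Spark's implementation or API: the `LazyDataset` class, the `double` helper, and the `calls` list are all hypothetical names invented for this illustration. The methods are named after the real RDD operations `map`, `filter`, `count`, and `collect` so the analogy is easy to follow.

```python
class LazyDataset:
    """Toy stand-in for an RDD (hypothetical, not Spark code).

    It records a plan of steps and runs nothing until an action is called.
    """

    def __init__(self, source, steps=()):
        self._source = source  # the original input data
        self._steps = steps    # tuple of (kind, function) pairs

    # Transformations: return a NEW LazyDataset and do no work.
    def map(self, fn):
        return LazyDataset(self._source, self._steps + (("map", fn),))

    def filter(self, pred):
        return LazyDataset(self._source, self._steps + (("filter", pred),))

    def _run(self):
        # Execute the recorded plan, one element at a time.
        for item in self._source:
            keep = True
            for kind, fn in self._steps:
                if kind == "map":
                    item = fn(item)
                elif not fn(item):  # a "filter" step rejected the element
                    keep = False
                    break
            if keep:
                yield item

    # Actions: run the plan and return a plain Python value.
    def count(self):
        return sum(1 for _ in self._run())

    def collect(self):
        return list(self._run())


calls = []  # records every time double() actually runs


def double(x):
    calls.append(x)
    return x * 2


numbers = LazyDataset([1, 2, 3, 4, 5])
doubled = numbers.map(double)           # transformation: nothing runs yet
big = doubled.filter(lambda x: x > 4)   # transformation: still nothing runs

calls_before_action = len(calls)  # double() has not been called so far
total = big.count()               # action: the whole plan executes now
calls_after_action = len(calls)   # double() ran once per input element
```

In this sketch `map` and `filter` only extend the recorded plan and hand back a new `LazyDataset`, while `count` is what finally walks the data and returns an ordinary integer. Real Spark does far more at this point (it builds and optimizes an execution plan and distributes the work across the cluster), but the user-facing contract is the same: transformations return a new dataset, actions return a result.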

What is the difference between a Transformation and Action in Spark?
