Databricks Community

ls · ‎01-13-2025

As the title suggests I have a bunch of lambda functions within my notebooks and I wanted to know if it is considered to be "bad" to have them in there.

output_list = json_files.mapPartitions(lambda partition: iter([process_partition(partition)])) \
    .filter(lambda df: not df.empty) \
    .flatMap(lambda df: df.to_dict(orient="records")) \
    .map(lambda row_dict: Row(**row_dict)) \
    .toDF()

The code above works but I wanted to know if it is ok to have that many lambda functions together.

Alberto_Umana · ‎01-13-2025

Hi @ls,

Using multiple lambda functions in your code is not necessarily bad, but you should consider the readability, maintainability, and reusability of your code. If the lambda functions are simple and the logic is clear, then it's fine to use them. Otherwise, consider defining named functions to improve the overall quality of your code.
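As a rough sketch of what the named-function approach could look like for the code in the question: the function names below are made up for illustration, and `json_files`, `process_partition` and `Row` are assumed to be the ones from the original post. The Spark chain is shown only as a comment so the snippet runs without a cluster.

```python
def is_non_empty(df):
    """Keep only partition results that contain at least one row."""
    return not df.empty


def to_records(df):
    """Turn one pandas DataFrame into a list of row dictionaries."""
    return df.to_dict(orient="records")


# Hypothetical usage, assuming json_files, process_partition and Row
# are defined as in the question:
#
# output_df = (
#     json_files
#     .mapPartitions(lambda partition: iter([process_partition(partition)]))
#     .filter(is_non_empty)
#     .flatMap(to_records)
#     .map(lambda row_dict: Row(**row_dict))
#     .toDF()
# )
```

The one-line lambdas that only wrap a call are left as they are; only the steps with their own logic get a name and a docstring.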


Satyadeepak · ‎01-13-2025

Using lambda functions within notebooks is not inherently "bad," but there are some considerations to keep in mind. While this code is functional, chaining multiple lambda functions can make it harder to read and to debug in Databricks notebooks. Error tracebacks become less informative, because every lambda shows up under the same name, `<lambda>`.

If there are any performance problems, it is also difficult to add logging or inspect intermediate results.
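One possible way to get logging into a step, sketched here under some assumptions: the `logged` decorator and the logger name are hypothetical, and `to_records` stands in for the `flatMap` lambda from the question. Where the messages end up depends on the cluster's logging setup; output from code running on executors usually lands in the executor logs rather than in the notebook cell.

```python
import functools
import logging

logger = logging.getLogger("pipeline_steps")  # hypothetical logger name


def logged(step_name):
    """Decorator that logs how many items a step returned, when it has a length."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(value):
            result = fn(value)
            size = len(result) if hasattr(result, "__len__") else "n/a"
            logger.info("%s produced %s item(s)", step_name, size)
            return result
        return wrapper
    return decorator


@logged("to_records")
def to_records(df):
    """Turn one pandas DataFrame into a list of row dictionaries."""
    return df.to_dict(orient="records")


# Hypothetical usage in the chain from the question:
#     .flatMap(to_records)
# To look at intermediate results, the chain can be cut short and
# sampled with the RDD method take(), e.g. some_rdd.take(5).
```

Because the step has a real name, it also appears under that name in tracebacks instead of `<lambda>`.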

