Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Nastasia
by New Contributor II
  • 3733 Views
  • 3 replies
  • 1 kudos

Why is Spark creating multiple jobs for one action?

I noticed that when launching this bunch of code with only one action, I have three jobs that are launched.

from pyspark.sql import DataFrame
from pyspark.sql.types import StructType, StructField, StringType
from pyspark.sql.functions import avg

data:...
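The snippet above is cut off; a minimal sketch of the shape of code being described, where the data, schema, and column names are illustrative guesses (only the imports survive in the excerpt), would be:

from pyspark.sql import DataFrame, SparkSession
from pyspark.sql.types import StructType, StructField, StringType
from pyspark.sql.functions import avg

spark = SparkSession.builder.getOrCreate()

# Hypothetical data and schema standing in for the truncated originals.
data = [("Alice", "10"), ("Bob", "20"), ("Alice", "30")]
schema = StructType([
    StructField("name", StringType(), True),
    StructField("score", StringType(), True),
])

dataframe: DataFrame = spark.createDataFrame(data=data, schema=schema)

# The single explicit action; the Spark UI may still show several jobs for it.
dataframe.groupBy("name").agg(avg("score")).show()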

[Attached image: https___i.stack.imgur.com_xfYDe.png]
Latest Reply
RKNutalapati
Valued Contributor
  • 1 kudos

The above code will create two jobs.

JOB-1. dataframe: DataFrame = spark.createDataFrame(data=data, schema=schema)
The createDataFrame function is responsible for inferring the schema from the provided data or using the specified schema. Depending on the...
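A hedged way to check what this reply describes is to compare an inferred-schema call with an explicit-schema call; whether inference shows up as its own job depends on the data source and Spark version:

from pyspark.sql import SparkSession
from pyspark.sql.types import StructType, StructField, StringType

spark = SparkSession.builder.getOrCreate()

# Hypothetical rows and schema, as in the sketch under the question.
data = [("Alice", "10"), ("Bob", "20")]
schema = StructType([
    StructField("name", StringType(), True),
    StructField("score", StringType(), True),
])

inferred_df = spark.createDataFrame(data)           # column types inferred from the rows
explicit_df = spark.createDataFrame(data, schema)   # explicit schema, nothing to infer

# Compare the Jobs tab of the Spark UI after running each call to see
# how many jobs, if any, each one triggers.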

Charmin
by New Contributor
  • 887 Views
  • 1 reply
  • 0 kudos

Why does the 'runCommand' action NOT show up in the DatabricksNotebook audit log table?

I understand Databricks can send diagnostic/audit logs to Log Analytics in Azure. There is a standard 'DatabricksNotebook' table that provides an audit log for notebook actions. In this table there is an action called 'runCommand', but this does not show...

Latest Reply
rsamant07
New Contributor III
  • 0 kudos

Hi @Charmin patel, you need to enable verbose audit logging in the workspace settings for runCommand to appear in the audit logs.
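For reference, the same toggle that lives under the workspace admin settings can also be flipped through the workspace configuration REST API; a rough sketch (the host, token, and config key are assumptions to verify against the current Databricks docs):

import requests

HOST = "https://<your-workspace>.azuredatabricks.net"  # placeholder workspace URL
TOKEN = "<personal-access-token>"                      # placeholder PAT

# Assumed config key for the verbose audit logging toggle.
resp = requests.patch(
    f"{HOST}/api/2.0/workspace-conf",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json={"enableVerboseAuditLogs": "true"},
)
resp.raise_for_status()  # succeeds with an empty 204 response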

Mradul07
by New Contributor II
  • 701 Views
  • 0 replies
  • 1 kudos

Spark behavior while dealing with Actions & Transformations?

Hi, my question is: what happens to the initial RDD after an action is performed on it? Does it disappear, stay in memory, or does it need to be explicitly cached with cache() if we want to use it again? For example, if I execute this in sequence: df_outp...
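The example is cut off above, but the general point can be sketched; df_output below is a hypothetical stand-in for the truncated DataFrame:

from pyspark.sql import SparkSession
from pyspark.sql.functions import col

spark = SparkSession.builder.getOrCreate()

# Hypothetical stand-in for the truncated df_outp... above.
df_output = spark.range(1_000_000).withColumn("doubled", col("id") * 2)

# Without cache(), every action re-computes the lineage from spark.range.
df_output.cache()            # marks the DataFrame for caching (lazy)
print(df_output.count())     # first action: computes and fills the cache
print(df_output.count())     # second action: reads from the cached data

df_output.unpersist()        # free the cached blocks when finished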

aladda
by Honored Contributor II
  • 3053 Views
  • 1 reply
  • 0 kudos
Latest Reply
aladda
Honored Contributor II
  • 0 kudos

Spark's execution engine is designed to be lazy. In effect, you first build up your analytics/data-processing request through a series of Transformations, which are then executed by an Action. Transformations are the kind of operations which will tran...
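A small illustration of that laziness (the names here are made up for the example):

from pyspark.sql import SparkSession
from pyspark.sql.functions import col

spark = SparkSession.builder.getOrCreate()

df = spark.range(100)                            # no job runs yet
evens = df.filter(col("id") % 2 == 0)            # transformation: lazy
doubled = evens.withColumn("x2", col("id") * 2)  # transformation: lazy

# Nothing has executed so far; Spark has only built a logical plan.
doubled.show(5)                                  # action: triggers the computation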
