Data Engineering

Forum Posts

Nastasia
by New Contributor II
  • 2175 Views
  • 3 replies
  • 1 kudos

Why is Spark creating multiple jobs for one action?

I noticed that when launching this bunch of code with only one action, I have three jobs that are launched.

from pyspark.sql import DataFrame
from pyspark.sql.types import StructType, StructField, StringType
from pyspark.sql.functions import avg

data:...

Latest Reply
RKNutalapati
Valued Contributor
  • 1 kudos

The above code will create two jobs.

JOB-1: dataframe: DataFrame = spark.createDataFrame(data=data, schema=schema)

The createDataFrame function is responsible for inferring the schema from the provided data or using the specified schema. Depending on the...

  • 1 kudos
2 More Replies
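For readers following along, here is a minimal, self-contained sketch of the pattern discussed in this thread; the data, schema, and column names are illustrative assumptions, not the original poster's code. Passing an explicit schema lets Spark skip the type-inference pass, and even then a single action such as show() can surface more than one job in the Spark UI (for example under adaptive query execution):

from pyspark.sql import SparkSession
from pyspark.sql.types import StructType, StructField, StringType, IntegerType
from pyspark.sql.functions import avg

spark = SparkSession.builder.getOrCreate()

# Illustrative data and schema (assumptions, not the original poster's code).
data = [("a", 1), ("b", 2), ("a", 3)]
schema = StructType([
    StructField("key", StringType()),
    StructField("value", IntegerType()),
])

# With an explicit schema, Spark does not need to sample the data to infer
# column types; schema inference over an RDD can otherwise trigger its own job.
dataframe = spark.createDataFrame(data=data, schema=schema)

# One logical action, but the Spark UI may report several jobs for it,
# e.g. extra jobs introduced by adaptive query execution or row collection.
dataframe.groupBy("key").agg(avg("value")).show()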
Charmin
by New Contributor
  • 541 Views
  • 1 reply
  • 0 kudos

Why does the 'runCommand' action NOT show up in the DatabricksNotebook audit log table?

I understand Databricks can send diagnostic/audit logs to Log Analytics in Azure. There is a standard 'DatabricksNotebook' table that provides an audit log for notebook actions. In this table there is an action called 'runCommand', but this does not show...

Latest Reply
rsamant07
New Contributor III
  • 0 kudos

Hi @Charmin patel, you need to enable verbose audit logging in the workspace settings for runCommand to appear in the audit logs.

  • 0 kudos
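The toggle lives in the workspace admin settings UI; as a hedged sketch, the same setting can also be flipped programmatically with the Databricks Python SDK (databricks-sdk), assuming admin credentials and the workspace-conf API:

from databricks.sdk import WorkspaceClient

w = WorkspaceClient()  # picks up auth from env vars or ~/.databrickscfg

# 'enableVerboseAuditLogs' is the workspace-conf key behind the verbose
# audit logging toggle; runCommand events are emitted only while it is "true".
w.workspace_conf.set_status({"enableVerboseAuditLogs": "true"})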
Mradul07
by New Contributor II
  • 404 Views
  • 0 replies
  • 1 kudos

Spark behavior while dealing with Actions & Transformations?

Hi, my question is: what happens to the initial RDD after an action is performed on it? Does it disappear, stay in memory, or does it need to be explicitly cached with cache() if we want to use it again? For example, if I execute this in sequence: df_outp...

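The question went unanswered, but a minimal sketch illustrates the behavior being asked about; the DataFrame below is a made-up stand-in for the poster's truncated df_outp... example. Without cache(), each action recomputes the lineage from the source; cache() makes the first action materialize the data so later actions can reuse it:

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Made-up DataFrame standing in for the poster's example.
df_output = spark.range(1_000_000).withColumnRenamed("id", "value")

df_output.cache()                        # lazy: only marks the plan for caching
df_output.count()                        # first action materializes the cache
df_output.filter("value > 10").count()   # reuses cached data, no source rescan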
Anand_Ladda
by Honored Contributor II
  • 2192 Views
  • 1 reply
  • 0 kudos
Latest Reply
Anand_Ladda
Honored Contributor II
  • 0 kudos

Spark's execution engine is designed to be lazy. In effect, you first build up your analytics/data processing request through a series of transformations, which are then executed by an action. Transformations are the kind of operations that will tran...

  • 0 kudos
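A minimal sketch of the laziness described above, with made-up column names: the transformations only build a plan, and nothing executes until the action is called.

from pyspark.sql import SparkSession
from pyspark.sql.functions import col

spark = SparkSession.builder.getOrCreate()

df = spark.range(100)                            # transformation: no job runs
doubled = df.select((col("id") * 2).alias("x"))  # still nothing executed

doubled.show(5)  # action: Spark builds the physical plan and runs a job now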