Hi all,
Environment:
Nodes: Standard_E8s_v3
Databricks Runtime: 9.0
.NET for Apache Spark 2.0.0
I'm using spark-submit to run a .NET for Apache Spark job hosted in Azure Databricks. The job is written in C# and has a single transformation and action: it reads a CSV and then displays its records. The job had been running swimmingly for months, but recently I've noticed that it no longer terminates on its own after completion. It performs the work and then remains active indefinitely until I manually terminate it.
This is the app's code:
SparkSession spark = SparkSession
    .Builder()
    .AppName("My App Name")
    .GetOrCreate();

string sourcePath = args[0];

DataFrame df = spark
    .Read()
    .Option("header", "true")
    .Option("quote", "\"")
    .Csv(sourcePath);

df.Show();

spark.Stop();
I've attached a dump of the driver's Log4j output.
Edit 16/12/2021:
The problem is possibly related to the workers refusing to shut down after completing their work, as indicated by the final entry in the workers' stderr output...
21/12/16 00:23:10 INFO DBFS: Initialized DBFS with DBFSV2 as the delegate.
21/12/16 00:23:10 INFO Utils: resolved command to be run: WrappedArray(getconf, PAGESIZE)
21/12/16 00:23:10 INFO FileScanRDD: Reading File path: dbfs:/mnt/opstats/raw/snakes/intercom.csv, range: 0-695442, partition values: [empty row], modificationTime: 1639438344000.
21/12/16 00:23:10 INFO Executor: Finished task 0.0 in stage 0.0 (TID 0). 1703 bytes result sent to driver
21/12/16 00:23:16 INFO CoarseGrainedExecutorBackend: Driver commanded a shutdown
21/12/16 00:23:16 INFO MemoryStore: MemoryStore cleared
21/12/16 00:23:16 INFO BlockManager: BlockManager stopped
21/12/16 00:23:16 ERROR CoarseGrainedExecutorBackend: RECE
Is anybody able to shed some light on this mysterious issue?
Thanks
Tim.