<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>Naming jobs in the Spark UI in Databricks Runtime 15.4 in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/naming-jobs-in-the-spark-ui-in-databricks-runtime-15-4/m-p/84046#M37116</link>
    <description>&lt;P&gt;I am asking almost the same question as &lt;A title="here" href="https://community.databricks.com/t5/data-engineering/how-to-improve-spark-ui-job-description-for-pyspark/td-p/48959" target="_self"&gt;https://community.databricks.com/t5/data-engineering/how-to-improve-spark-ui-job-description-for-pyspark/td-p/48959&lt;/A&gt;: I would like to know how to improve the readability of the Spark UI by naming jobs. I am using PySpark.&lt;/P&gt;&lt;P&gt;Because I am running Databricks Runtime 15.4, I receive the following message when accessing the sparkContext:&lt;/P&gt;&lt;P&gt;&lt;EM&gt;[JVM_ATTRIBUTE_NOT_SUPPORTED] Directly accessing the underlying Spark driver JVM using the attribute 'sparkContext' is not supported on shared clusters. If you require direct access to these fields, consider using a single-user cluster. For more details on compatibility and limitations, check: &lt;A href="https://docs.databricks.com/compute/access-mode-limitations.html#shared-access-mode-limitations-on-unity-catalog" target="_blank" rel="noopener noreferrer"&gt;https://docs.databricks.com/compute/access-mode-limitations.html#shared-access-mode-limitations-on-unity-catalog&lt;/A&gt;&lt;/EM&gt;&lt;/P&gt;&lt;P&gt;Accordingly, I do not think that I can use setJobDescription and setName as outlined in that answer. Could you please give an example of naming jobs and tasks, including the Python class that is called? Could you also confirm what the effect of using df.alias("example_df") should be in the Spark UI?&lt;/P&gt;&lt;P&gt;I would consider this question answered with these examples:&lt;BR /&gt;- Set the JobGroup name in the Spark UI, either from the driver or the worker.&lt;BR /&gt;- Set the Job description in the Spark UI, either from the driver or the worker.&lt;BR /&gt;- Set the Stage description in the Spark UI, either from the driver or the worker.&lt;BR /&gt;- Set the descriptions on the visual blocks in the DAG visualization pages.&lt;/P&gt;&lt;P&gt;The example code, before annotations are added, could look like the following:&lt;/P&gt;&lt;LI-CODE lang="python"&gt;from pyspark.sql import SparkSession

def stream_parquet_to_delta(s3_path, delta_table_path):
  # Initialize Spark session
  spark = SparkSession.builder.appName("StreamToDeltaExample").getOrCreate()

  # Read streaming data from S3 in Parquet format
  streaming_df = spark.readStream.format("parquet").load(s3_path)

  # Define the function to process each micro-batch
  def process_batch(df, batch_id):
     # Write the micro-batch to Delta table
     df.write.format("delta").mode("append").save(delta_table_path)

  # Write streaming data to Delta table using foreachBatch
  streaming_df.writeStream.foreachBatch(process_batch).start().awaitTermination()

# Example usage
stream_parquet_to_delta("s3://my-bucket/streaming-data", "/delta-table/streaming_data")&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
    <pubDate>Fri, 23 Aug 2024 09:45:35 GMT</pubDate>
    <dc:creator>sunnyday</dc:creator>
    <dc:date>2024-08-23T09:45:35Z</dc:date>
    <item>
      <title>Naming jobs in the Spark UI in Databricks Runtime 15.4</title>
      <link>https://community.databricks.com/t5/data-engineering/naming-jobs-in-the-spark-ui-in-databricks-runtime-15-4/m-p/84046#M37116</link>
      <description>&lt;P&gt;I am asking almost the same question as &lt;A title="here" href="https://community.databricks.com/t5/data-engineering/how-to-improve-spark-ui-job-description-for-pyspark/td-p/48959" target="_self"&gt;https://community.databricks.com/t5/data-engineering/how-to-improve-spark-ui-job-description-for-pyspark/td-p/48959&lt;/A&gt;: I would like to know how to improve the readability of the Spark UI by naming jobs. I am using PySpark.&lt;/P&gt;&lt;P&gt;Because I am running Databricks Runtime 15.4, I receive the following message when accessing the sparkContext:&lt;/P&gt;&lt;P&gt;&lt;EM&gt;[JVM_ATTRIBUTE_NOT_SUPPORTED] Directly accessing the underlying Spark driver JVM using the attribute 'sparkContext' is not supported on shared clusters. If you require direct access to these fields, consider using a single-user cluster. For more details on compatibility and limitations, check: &lt;A href="https://docs.databricks.com/compute/access-mode-limitations.html#shared-access-mode-limitations-on-unity-catalog" target="_blank" rel="noopener noreferrer"&gt;https://docs.databricks.com/compute/access-mode-limitations.html#shared-access-mode-limitations-on-unity-catalog&lt;/A&gt;&lt;/EM&gt;&lt;/P&gt;&lt;P&gt;Accordingly, I do not think that I can use setJobDescription and setName as outlined in that answer. Could you please give an example of naming jobs and tasks, including the Python class that is called? Could you also confirm what the effect of using df.alias("example_df") should be in the Spark UI?&lt;/P&gt;&lt;P&gt;I would consider this question answered with these examples:&lt;BR /&gt;- Set the JobGroup name in the Spark UI, either from the driver or the worker.&lt;BR /&gt;- Set the Job description in the Spark UI, either from the driver or the worker.&lt;BR /&gt;- Set the Stage description in the Spark UI, either from the driver or the worker.&lt;BR /&gt;- Set the descriptions on the visual blocks in the DAG visualization pages.&lt;/P&gt;&lt;P&gt;The example code, before annotations are added, could look like the following:&lt;/P&gt;&lt;LI-CODE lang="python"&gt;from pyspark.sql import SparkSession

def stream_parquet_to_delta(s3_path, delta_table_path):
  # Initialize Spark session
  spark = SparkSession.builder.appName("StreamToDeltaExample").getOrCreate()

  # Read streaming data from S3 in Parquet format
  streaming_df = spark.readStream.format("parquet").load(s3_path)

  # Define the function to process each micro-batch
  def process_batch(df, batch_id):
     # Write the micro-batch to Delta table
     df.write.format("delta").mode("append").save(delta_table_path)

  # Write streaming data to Delta table using foreachBatch
  streaming_df.writeStream.foreachBatch(process_batch).start().awaitTermination()

# Example usage
stream_parquet_to_delta("s3://my-bucket/streaming-data", "/delta-table/streaming_data")&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Fri, 23 Aug 2024 09:45:35 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/naming-jobs-in-the-spark-ui-in-databricks-runtime-15-4/m-p/84046#M37116</guid>
      <dc:creator>sunnyday</dc:creator>
      <dc:date>2024-08-23T09:45:35Z</dc:date>
    </item>
    <item>
      <title>Re: Naming jobs in the Spark UI in Databricks Runtime 15.4</title>
      <link>https://community.databricks.com/t5/data-engineering/naming-jobs-in-the-spark-ui-in-databricks-runtime-15-4/m-p/139239#M51123</link>
      <description>&lt;P&gt;You are correct: on Databricks Runtime 15.4 with &lt;STRONG&gt;shared clusters&lt;/STRONG&gt; (or Unity Catalog-enabled clusters), you will see the &lt;CODE&gt;[JVM_ATTRIBUTE_NOT_SUPPORTED]&lt;/CODE&gt; error when trying to directly access &lt;CODE&gt;sparkContext&lt;/CODE&gt; attributes that are only available in single-user cluster modes. This means &lt;CODE&gt;sc.setJobGroup()&lt;/CODE&gt;, &lt;CODE&gt;sc.setJobDescription()&lt;/CODE&gt;, and &lt;CODE&gt;sparkSession.sparkContext.setLocalProperty()&lt;/CODE&gt; are disabled in this mode. Below are your requested clarifications and alternatives.&lt;/P&gt;
&lt;HR /&gt;
&lt;H2&gt;Job and Task Naming on Databricks 15.4 (Shared Cluster / Unity Catalog): What Works and What Doesn't&lt;/H2&gt;
&lt;DIV class="group relative"&gt;
&lt;DIV class="w-full overflow-x-auto md:max-w-[90vw] border-subtlest ring-subtlest divide-subtlest bg-transparent"&gt;
&lt;TABLE class="border-subtler my-[1em] w-full table-auto border-separate border-spacing-0 border-l border-t"&gt;
&lt;THEAD class="bg-subtler"&gt;
&lt;TR&gt;
&lt;TH class="border-subtler p-sm break-normal border-b border-r text-left align-top"&gt;Method&lt;/TH&gt;
&lt;TH class="border-subtler p-sm break-normal border-b border-r text-left align-top"&gt;Availability (Shared Cluster/Unity Catalog)&lt;/TH&gt;
&lt;TH class="border-subtler p-sm break-normal border-b border-r text-left align-top"&gt;How to Use / Alternatives&lt;/TH&gt;
&lt;/TR&gt;
&lt;/THEAD&gt;
&lt;TBODY&gt;
&lt;TR&gt;
&lt;TD class="px-sm border-subtler min-w-[48px] break-normal border-b border-r"&gt;&lt;CODE&gt;sc.setJobGroup()&lt;/CODE&gt;&lt;/TD&gt;
&lt;TD class="px-sm border-subtler min-w-[48px] break-normal border-b border-r"&gt;&lt;span class="lia-unicode-emoji" title=":cross_mark:"&gt;❌&lt;/span&gt; Not Supported&lt;/TD&gt;
&lt;TD class="px-sm border-subtler min-w-[48px] break-normal border-b border-r"&gt;Not possible&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class="px-sm border-subtler min-w-[48px] break-normal border-b border-r"&gt;&lt;CODE&gt;sc.setJobDescription()&lt;/CODE&gt;&lt;/TD&gt;
&lt;TD class="px-sm border-subtler min-w-[48px] break-normal border-b border-r"&gt;&lt;span class="lia-unicode-emoji" title=":cross_mark:"&gt;❌&lt;/span&gt; Not Supported&lt;/TD&gt;
&lt;TD class="px-sm border-subtler min-w-[48px] break-normal border-b border-r"&gt;Not possible&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class="px-sm border-subtler min-w-[48px] break-normal border-b border-r"&gt;&lt;CODE&gt;sc.setLocalProperty('spark.job.description', ...)&lt;/CODE&gt;&lt;/TD&gt;
&lt;TD class="px-sm border-subtler min-w-[48px] break-normal border-b border-r"&gt;&lt;span class="lia-unicode-emoji" title=":cross_mark:"&gt;❌&lt;/span&gt; Not Supported&lt;/TD&gt;
&lt;TD class="px-sm border-subtler min-w-[48px] break-normal border-b border-r"&gt;Not possible&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class="px-sm border-subtler min-w-[48px] break-normal border-b border-r"&gt;&lt;CODE&gt;DataFrame.alias("example_df")&lt;/CODE&gt;&lt;/TD&gt;
&lt;TD class="px-sm border-subtler min-w-[48px] break-normal border-b border-r"&gt;&lt;span class="lia-unicode-emoji" title=":white_heavy_check_mark:"&gt;✅&lt;/span&gt; Supported&lt;/TD&gt;
&lt;TD class="px-sm border-subtler min-w-[48px] break-normal border-b border-r"&gt;See effect below&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class="px-sm border-subtler min-w-[48px] break-normal border-b border-r"&gt;&lt;CODE&gt;DataFrame.writeStream.queryName()&lt;/CODE&gt;&lt;/TD&gt;
&lt;TD class="px-sm border-subtler min-w-[48px] break-normal border-b border-r"&gt;&lt;span class="lia-unicode-emoji" title=":white_heavy_check_mark:"&gt;✅&lt;/span&gt; Supported&lt;/TD&gt;
&lt;TD class="px-sm border-subtler min-w-[48px] break-normal border-b border-r"&gt;See example below&lt;/TD&gt;
&lt;/TR&gt;
&lt;/TBODY&gt;
&lt;/TABLE&gt;
&lt;/DIV&gt;
&lt;DIV class="bg-base border-subtler shadow-subtle pointer-coarse:opacity-100 right-xs absolute bottom-0 flex rounded-lg border opacity-0 transition-opacity group-hover:opacity-100 [&amp;amp;&amp;gt;*:not(:first-child)]:border-subtle [&amp;amp;&amp;gt;*:not(:first-child)]:border-l"&gt;
&lt;DIV class="flex"&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV class="flex"&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;HR /&gt;
&lt;H2 class="mb-2 mt-4 font-display font-semimedium text-base first:mt-0"&gt;How to Add Identifiable Names in Spark UI&lt;/H2&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;Rather than the methods that directly use&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;CODE&gt;sparkContext&lt;/CODE&gt;, use these supported options:&lt;/P&gt;
&lt;H2 class="mb-2 mt-4 font-display font-semimedium text-base first:mt-0"&gt;1.&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;STRONG&gt;Naming Structured Streaming Queries in Spark UI&lt;/STRONG&gt;&lt;/H2&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;Use&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;CODE&gt;.queryName()&lt;/CODE&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;to set a descriptive name for your streaming query. This will appear in the Spark UI under the “active streaming queries.”&lt;/P&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;&lt;STRONG&gt;Example&lt;/STRONG&gt;:&lt;/P&gt;
&lt;DIV class="w-full md:max-w-[90vw]"&gt;
&lt;DIV class="codeWrapper text-light selection:text-super selection:bg-super/10 my-md relative flex flex-col rounded-lg font-mono text-sm font-normal bg-subtler"&gt;
&lt;DIV class="translate-y-xs -translate-x-xs bottom-xl mb-xl flex h-0 items-start justify-end md:sticky md:top-[calc(var(--header-height)+var(--size-xs))]"&gt;
&lt;DIV class="overflow-hidden rounded-full border-subtlest ring-subtlest divide-subtlest bg-base"&gt;
&lt;DIV class="border-subtlest ring-subtlest divide-subtlest bg-subtler"&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;DIV class="-mt-xl"&gt;
&lt;DIV&gt;
&lt;DIV class="text-quiet bg-subtle py-xs px-sm inline-block rounded-br rounded-tl-lg text-xs font-thin" data-testid="code-language-indicator"&gt;python&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;DIV&gt;&lt;SPAN&gt;&lt;CODE&gt;streaming_df&lt;SPAN class="token token punctuation"&gt;.&lt;/SPAN&gt;writeStream \
    &lt;SPAN class="token token punctuation"&gt;.&lt;/SPAN&gt;queryName&lt;SPAN class="token token punctuation"&gt;(&lt;/SPAN&gt;&lt;SPAN class="token token"&gt;"ParquetToDelta_Stream"&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;)&lt;/SPAN&gt; \
    &lt;SPAN class="token token punctuation"&gt;.&lt;/SPAN&gt;foreachBatch&lt;SPAN class="token token punctuation"&gt;(&lt;/SPAN&gt;process_batch&lt;SPAN class="token token punctuation"&gt;)&lt;/SPAN&gt; \
    &lt;SPAN class="token token punctuation"&gt;.&lt;/SPAN&gt;start&lt;SPAN class="token token punctuation"&gt;(&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;)&lt;/SPAN&gt; \
    &lt;SPAN class="token token punctuation"&gt;.&lt;/SPAN&gt;awaitTermination&lt;SPAN class="token token punctuation"&gt;(&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;)&lt;/SPAN&gt;
&lt;/CODE&gt;&lt;/SPAN&gt;&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;This will name the job “ParquetToDelta_Stream” in the streaming tab of Spark UI.&lt;/P&gt;
&lt;HR /&gt;
&lt;H2 class="mb-2 mt-4 font-display font-semimedium text-base first:mt-0"&gt;2.&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;STRONG&gt;Effect of&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;CODE&gt;df.alias("example_df")&lt;/CODE&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;in Spark UI&lt;/STRONG&gt;&lt;/H2&gt;
&lt;UL class="marker:text-quiet list-disc"&gt;
&lt;LI class="py-0 my-0 prose-p:pt-0 prose-p:mb-2 prose-p:my-0 [&amp;amp;&amp;gt;p]:pt-0 [&amp;amp;&amp;gt;p]:mb-2 [&amp;amp;&amp;gt;p]:my-0"&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;The&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;CODE&gt;.alias("example_df")&lt;/CODE&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;method only sets a logical alias for the DataFrame used in SQL expressions and does&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;STRONG&gt;not&lt;/STRONG&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;affect job, stage, or DAG block naming in the Spark UI. It is most helpful for SQL readability and debugging optimizer plans, not for UI description.&lt;/P&gt;
&lt;/LI&gt;
&lt;/UL&gt;
&lt;HR /&gt;
&lt;H2 class="mb-2 mt-4 font-display font-semimedium text-base first:mt-0"&gt;3.&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;STRONG&gt;Block Descriptions in the DAG&lt;/STRONG&gt;&lt;/H2&gt;
&lt;UL class="marker:text-quiet list-disc"&gt;
&lt;LI class="py-0 my-0 prose-p:pt-0 prose-p:mb-2 prose-p:my-0 [&amp;amp;&amp;gt;p]:pt-0 [&amp;amp;&amp;gt;p]:mb-2 [&amp;amp;&amp;gt;p]:my-0"&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;There is&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;STRONG&gt;currently no direct API&lt;/STRONG&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;for user-defined block descriptions in the DAG on Databricks when using shared clusters. Block/stage names are inferred from the operations (e.g., "Project", "Aggregate", "Filter").&lt;/P&gt;
&lt;/LI&gt;
&lt;LI class="py-0 my-0 prose-p:pt-0 prose-p:mb-2 prose-p:my-0 [&amp;amp;&amp;gt;p]:pt-0 [&amp;amp;&amp;gt;p]:mb-2 [&amp;amp;&amp;gt;p]:my-0"&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;For more explicit names in the UI,&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;STRONG&gt;break your code into small, well-named functions&lt;/STRONG&gt;—thereby making your code easier to correlate with the Spark UI, though the block labels themselves are not user-customizable in shared clusters.&lt;/P&gt;
&lt;/LI&gt;
&lt;/UL&gt;
&lt;HR /&gt;
&lt;H2 class="mb-2 mt-4 font-display font-semimedium text-base first:mt-0"&gt;Best Practice Example for Databricks 15.4 (Shared Cluster Mode)&lt;/H2&gt;
&lt;DIV class="w-full md:max-w-[90vw]"&gt;
&lt;DIV class="codeWrapper text-light selection:text-super selection:bg-super/10 my-md relative flex flex-col rounded-lg font-mono text-sm font-normal bg-subtler"&gt;
&lt;DIV class="translate-y-xs -translate-x-xs bottom-xl mb-xl flex h-0 items-start justify-end md:sticky md:top-[calc(var(--header-height)+var(--size-xs))]"&gt;
&lt;DIV class="overflow-hidden rounded-full border-subtlest ring-subtlest divide-subtlest bg-base"&gt;
&lt;DIV class="border-subtlest ring-subtlest divide-subtlest bg-subtler"&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;DIV class="-mt-xl"&gt;
&lt;DIV&gt;
&lt;DIV class="text-quiet bg-subtle py-xs px-sm inline-block rounded-br rounded-tl-lg text-xs font-thin" data-testid="code-language-indicator"&gt;python&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;DIV&gt;&lt;SPAN&gt;&lt;CODE&gt;&lt;SPAN class="token token"&gt;from&lt;/SPAN&gt; pyspark&lt;SPAN class="token token punctuation"&gt;.&lt;/SPAN&gt;sql &lt;SPAN class="token token"&gt;import&lt;/SPAN&gt; SparkSession

&lt;SPAN class="token token"&gt;def&lt;/SPAN&gt; &lt;SPAN class="token token"&gt;stream_parquet_to_delta&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;(&lt;/SPAN&gt;s3_path&lt;SPAN class="token token punctuation"&gt;,&lt;/SPAN&gt; delta_table_path&lt;SPAN class="token token punctuation"&gt;)&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;:&lt;/SPAN&gt;
    spark &lt;SPAN class="token token operator"&gt;=&lt;/SPAN&gt; SparkSession&lt;SPAN class="token token punctuation"&gt;.&lt;/SPAN&gt;builder&lt;SPAN class="token token punctuation"&gt;.&lt;/SPAN&gt;appName&lt;SPAN class="token token punctuation"&gt;(&lt;/SPAN&gt;&lt;SPAN class="token token"&gt;"StreamToDeltaExample"&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;)&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;.&lt;/SPAN&gt;getOrCreate&lt;SPAN class="token token punctuation"&gt;(&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;)&lt;/SPAN&gt;
    streaming_df &lt;SPAN class="token token operator"&gt;=&lt;/SPAN&gt; spark&lt;SPAN class="token token punctuation"&gt;.&lt;/SPAN&gt;readStream&lt;SPAN class="token token punctuation"&gt;.&lt;/SPAN&gt;&lt;SPAN class="token token"&gt;format&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;(&lt;/SPAN&gt;&lt;SPAN class="token token"&gt;"parquet"&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;)&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;.&lt;/SPAN&gt;load&lt;SPAN class="token token punctuation"&gt;(&lt;/SPAN&gt;s3_path&lt;SPAN class="token token punctuation"&gt;)&lt;/SPAN&gt;

    &lt;SPAN class="token token"&gt;def&lt;/SPAN&gt; &lt;SPAN class="token token"&gt;process_batch&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;(&lt;/SPAN&gt;df&lt;SPAN class="token token punctuation"&gt;,&lt;/SPAN&gt; batch_id&lt;SPAN class="token token punctuation"&gt;)&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;:&lt;/SPAN&gt;
        &lt;SPAN class="token token"&gt;# logic here...&lt;/SPAN&gt;
        df&lt;SPAN class="token token punctuation"&gt;.&lt;/SPAN&gt;write&lt;SPAN class="token token punctuation"&gt;.&lt;/SPAN&gt;&lt;SPAN class="token token"&gt;format&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;(&lt;/SPAN&gt;&lt;SPAN class="token token"&gt;"delta"&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;)&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;.&lt;/SPAN&gt;mode&lt;SPAN class="token token punctuation"&gt;(&lt;/SPAN&gt;&lt;SPAN class="token token"&gt;"append"&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;)&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;.&lt;/SPAN&gt;save&lt;SPAN class="token token punctuation"&gt;(&lt;/SPAN&gt;delta_table_path&lt;SPAN class="token token punctuation"&gt;)&lt;/SPAN&gt;

    &lt;SPAN class="token token"&gt;# Set a meaningful name for the streaming query in Spark UI&lt;/SPAN&gt;
    streaming_df&lt;SPAN class="token token punctuation"&gt;.&lt;/SPAN&gt;writeStream \
      &lt;SPAN class="token token punctuation"&gt;.&lt;/SPAN&gt;queryName&lt;SPAN class="token token punctuation"&gt;(&lt;/SPAN&gt;&lt;SPAN class="token token"&gt;"ParquetToDelta_Stream"&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;)&lt;/SPAN&gt; \
      &lt;SPAN class="token token punctuation"&gt;.&lt;/SPAN&gt;foreachBatch&lt;SPAN class="token token punctuation"&gt;(&lt;/SPAN&gt;process_batch&lt;SPAN class="token token punctuation"&gt;)&lt;/SPAN&gt; \
      &lt;SPAN class="token token punctuation"&gt;.&lt;/SPAN&gt;start&lt;SPAN class="token token punctuation"&gt;(&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;)&lt;/SPAN&gt; \
      &lt;SPAN class="token token punctuation"&gt;.&lt;/SPAN&gt;awaitTermination&lt;SPAN class="token token punctuation"&gt;(&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;)&lt;/SPAN&gt;

stream_parquet_to_delta&lt;SPAN class="token token punctuation"&gt;(&lt;/SPAN&gt;&lt;SPAN class="token token"&gt;"s3://my-bucket/streaming-data"&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;,&lt;/SPAN&gt; &lt;SPAN class="token token"&gt;"/delta-table/streaming_data"&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;)&lt;/SPAN&gt;
&lt;/CODE&gt;&lt;/SPAN&gt;&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;UL class="marker:text-quiet list-disc"&gt;
&lt;LI class="py-0 my-0 prose-p:pt-0 prose-p:mb-2 prose-p:my-0 [&amp;amp;&amp;gt;p]:pt-0 [&amp;amp;&amp;gt;p]:mb-2 [&amp;amp;&amp;gt;p]:my-0"&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;This query will have the name “ParquetToDelta_Stream” in the Streaming tab of the Spark UI.&lt;/P&gt;
&lt;/LI&gt;
&lt;LI class="py-0 my-0 prose-p:pt-0 prose-p:mb-2 prose-p:my-0 [&amp;amp;&amp;gt;p]:pt-0 [&amp;amp;&amp;gt;p]:mb-2 [&amp;amp;&amp;gt;p]:my-0"&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;The code structure assists with relating code blocks to the DAG plan, but block descriptions are not directly moddable in the UI under shared clusters.&lt;/P&gt;
&lt;/LI&gt;
&lt;/UL&gt;
&lt;HR /&gt;
&lt;H2 class="mb-2 mt-4 font-display font-semimedium text-base first:mt-0"&gt;Summary Table: What Is and Isn't Possible&lt;/H2&gt;
&lt;DIV class="group relative"&gt;
&lt;DIV class="w-full overflow-x-auto md:max-w-[90vw] border-subtlest ring-subtlest divide-subtlest bg-transparent"&gt;
&lt;TABLE class="border-subtler my-[1em] w-full table-auto border-separate border-spacing-0 border-l border-t"&gt;
&lt;THEAD class="bg-subtler"&gt;
&lt;TR&gt;
&lt;TH class="border-subtler p-sm break-normal border-b border-r text-left align-top"&gt;Want to Set...&lt;/TH&gt;
&lt;TH class="border-subtler p-sm break-normal border-b border-r text-left align-top"&gt;Supported? (Shared Cluster)&lt;/TH&gt;
&lt;TH class="border-subtler p-sm break-normal border-b border-r text-left align-top"&gt;How/Alternative&lt;/TH&gt;
&lt;/TR&gt;
&lt;/THEAD&gt;
&lt;TBODY&gt;
&lt;TR&gt;
&lt;TD class="px-sm border-subtler min-w-[48px] break-normal border-b border-r"&gt;Job Group Name&lt;/TD&gt;
&lt;TD class="px-sm border-subtler min-w-[48px] break-normal border-b border-r"&gt;No&lt;/TD&gt;
&lt;TD class="px-sm border-subtler min-w-[48px] break-normal border-b border-r"&gt;Not supported&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class="px-sm border-subtler min-w-[48px] break-normal border-b border-r"&gt;Job Description&lt;/TD&gt;
&lt;TD class="px-sm border-subtler min-w-[48px] break-normal border-b border-r"&gt;No&lt;/TD&gt;
&lt;TD class="px-sm border-subtler min-w-[48px] break-normal border-b border-r"&gt;Not supported&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class="px-sm border-subtler min-w-[48px] break-normal border-b border-r"&gt;Stage Description&lt;/TD&gt;
&lt;TD class="px-sm border-subtler min-w-[48px] break-normal border-b border-r"&gt;No&lt;/TD&gt;
&lt;TD class="px-sm border-subtler min-w-[48px] break-normal border-b border-r"&gt;Not supported&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class="px-sm border-subtler min-w-[48px] break-normal border-b border-r"&gt;Query Name (Streaming)&lt;/TD&gt;
&lt;TD class="px-sm border-subtler min-w-[48px] break-normal border-b border-r"&gt;Yes&lt;/TD&gt;
&lt;TD class="px-sm border-subtler min-w-[48px] break-normal border-b border-r"&gt;Use&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;CODE&gt;.queryName()&lt;/CODE&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;on&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;CODE&gt;writeStream&lt;/CODE&gt;&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class="px-sm border-subtler min-w-[48px] break-normal border-b border-r"&gt;Table/View Name&lt;/TD&gt;
&lt;TD class="px-sm border-subtler min-w-[48px] break-normal border-b border-r"&gt;Yes&lt;/TD&gt;
&lt;TD class="px-sm border-subtler min-w-[48px] break-normal border-b border-r"&gt;Create temporary views with&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;CODE&gt;createOrReplaceTempView()&lt;/CODE&gt;&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class="px-sm border-subtler min-w-[48px] break-normal border-b border-r"&gt;DataFrame Alias&lt;/TD&gt;
&lt;TD class="px-sm border-subtler min-w-[48px] break-normal border-b border-r"&gt;Yes&lt;/TD&gt;
&lt;TD class="px-sm border-subtler min-w-[48px] break-normal border-b border-r"&gt;Use&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;CODE&gt;.alias()&lt;/CODE&gt;, but only affects SQL plans&lt;/TD&gt;
&lt;/TR&gt;
&lt;/TBODY&gt;
&lt;/TABLE&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;</description>
      <pubDate>Sun, 16 Nov 2025 17:51:47 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/naming-jobs-in-the-spark-ui-in-databricks-runtime-15-4/m-p/139239#M51123</guid>
      <dc:creator>mark_ott</dc:creator>
      <dc:date>2025-11-16T17:51:47Z</dc:date>
    </item>
  </channel>
</rss>

