Data Engineering

Delta Live Table Schema Comment

Dave_Nithio
Contributor

I predefined my schema for a Delta Live Tables Auto Loader stream, including comments for some attributes. With a standard readStream my comments appear, but in Delta Live Tables I get no comments. Is there anything I need to do to get the comments to appear?

Schema definition:

from pyspark.sql.types import StructType, StructField, StringType

schema = StructType([
  StructField("uuid", StringType(), True, {'comment': "Unique customer id"}),
  StructField("GPS", StringType(), True)])

Delta Live Table Stream:

@dlt.table(name="test_bronze",
                  comment = "test account data incrementally ingested from S3 Raw landing zone",
  table_properties={
    "quality": "bronze"
  }
)
 
# Stream data
#@dlt.table
def test_bronze():
  return (
    spark.readStream
                  .format("cloudFiles")
                  .option("cloudFiles.format", "csv")
                  .option("header", "True")
                  .schema(schema)
                  .load(data_source)
  )

But no comments appear in the data (screenshot not preserved).

Accepted Solutions

Hubert-Dudek
Esteemed Contributor III

You need to add your schema to the dlt declaration:

@dlt.table(
   name="test_bronze",
   comment = "test account data incrementally ingested from S3 Raw landing zone",
   table_properties={ "quality": "bronze" },
   schema=schema)
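
As a hedged sketch of a related option: the schema argument of @dlt.table is documented as accepting either a Python StructType or a SQL DDL string, so column comments can also be carried in a DDL string. The helper name build_ddl_schema and the column list below are hypothetical and only for illustration. The dlt usage appears as comments because the dlt module is only importable inside a Delta Live Tables pipeline.

```python
def build_ddl_schema(columns):
    """Build a SQL DDL schema string from (name, sql_type, comment) tuples.

    A comment of None means the column gets no COMMENT clause.
    """
    parts = []
    for name, sql_type, comment in columns:
        part = f"{name} {sql_type}"
        if comment is not None:
            # Assumption: Spark SQL's default backslash escaping for quotes.
            escaped = comment.replace("'", "\\'")
            part += f" COMMENT '{escaped}'"
        parts.append(part)
    return ", ".join(parts)


# Same two columns as the schema in the question.
schema_ddl = build_ddl_schema([
    ("uuid", "STRING", "Unique customer id"),
    ("GPS", "STRING", None),
])

# Inside a pipeline, the string would be passed the same way as the StructType:
# @dlt.table(name="test_bronze", schema=schema_ddl)
# def test_bronze():
#     ...
```

Either form should work here; the point of the accepted answer is that the schema, with its comments, has to reach the table declaration and not only the reader.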


Replies

Debayan
Esteemed Contributor III

Hi @Dave Wilson, are you getting any errors?

You can include comments.

Delta Live Tables automatically captures the dependencies between datasets defined in your pipeline and uses this dependency information to determine the execution order when performing an update and to record lineage information in the event log for a pipeline.

Both views and tables have the following optional properties:

  • COMMENT: A human-readable description of this dataset.

Please refer to https://docs.databricks.com/workflows/delta-live-tables/delta-live-tables-sql-ref.html#sql-datasets

Trodenn
New Contributor III

What does adding table_properties do again? Any links to the documentation?

Hubert-Dudek
Esteemed Contributor III

table_properties are optional parameters that you can use to configure various aspects of your Delta Live Tables, such as optimization, partitioning, and retention, and also to set your own custom tags.

You can find more details about table_properties and their possible values on the Table properties page: https://learn.microsoft.com/en-us/azure/databricks/workflows/delta-live-tables/dlt-table-properties
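
For illustration only, a table_properties dictionary might mix a custom tag with pipeline and Delta settings. The "quality" tag comes from this thread; the other keys are property names I believe are documented, but please check them against the page linked above. The values and the helper check_table_properties are hypothetical, and the sketch assumes property values are passed as strings.

```python
table_properties = {
    "quality": "bronze",                               # custom tag, as in the thread
    "pipelines.autoOptimize.zOrderCols": "uuid",       # comma-separated column names
    "pipelines.reset.allowed": "false",                # illustrative value
    "delta.logRetentionDuration": "interval 30 days",  # illustrative value
}


def check_table_properties(props):
    """Return the keys whose key or value is not a string."""
    return [k for k, v in props.items()
            if not isinstance(k, str) or not isinstance(v, str)]


invalid_keys = check_table_properties(table_properties)

# Inside a pipeline the dictionary would be passed to the decorator:
# @dlt.table(name="test_bronze", table_properties=table_properties)
```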

Kaniz
Community Manager

Hi @Dave Wilson, we haven't heard from you since the last response from @Debayan Mukherjee and @Hubert Dudek, and I was checking back to see if you have a resolution yet.

If you have any solution, please share it with the community as it can be helpful to others. Otherwise, we will respond with more details and try to help.

Also, please don't forget to click the "Select As Best" button whenever the information provided helps resolve your question.
