Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
Delta Live Table Schema Comment

Dave_Nithio
Contributor

I predefined my schema for a Delta Live Tables Auto Loader stream, including comments for some attributes. With a standard readStream my comments appear, but in Delta Live Tables I get no comments. Is there anything I need to do to get the comments to appear?

Schema definition:

from pyspark.sql.types import StructType, StructField, StringType

schema = StructType([
  StructField("uuid", StringType(), True, {"comment": "Unique customer id"}),
  StructField("GPS", StringType(), True)])

Delta Live Table Stream:

@dlt.table(
  name="test_bronze",
  comment="test account data incrementally ingested from S3 Raw landing zone",
  table_properties={
    "quality": "bronze"
  }
)
# Stream data
def test_bronze():
  return (
    spark.readStream
      .format("cloudFiles")
      .option("cloudFiles.format", "csv")
      .option("header", "True")
      .schema(schema)
      .load(data_source)
  )

But no comments appear in the resulting table:

[image not preserved]

Accepted Solutions

Hubert-Dudek
Esteemed Contributor III

You need to add your schema to the dlt declaration:

@dlt.table(
   name="test_bronze",
   comment = "test account data incrementally ingested from S3 Raw landing zone",
   table_properties={ "quality": "bronze" },
   schema=schema)
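
As a rough sketch of an alternative way to carry column comments, the schema can also be written as a Spark SQL DDL string with COMMENT clauses. The `Column` type and `to_ddl` helper below are hypothetical names introduced only for illustration; they are not part of dlt or PySpark. The sketch assumes the pipeline accepts a DDL string for the `schema` argument, which is worth checking against the Delta Live Tables Python reference for your runtime.

```python
from typing import NamedTuple, Optional, Sequence


class Column(NamedTuple):
    """One column definition (hypothetical helper type, not part of dlt)."""
    name: str
    data_type: str
    comment: Optional[str] = None


def to_ddl(columns: Sequence[Column]) -> str:
    """Render columns as a DDL schema string, adding COMMENT where one is set."""
    parts = []
    for col in columns:
        part = f"{col.name} {col.data_type}"
        if col.comment:
            # Escape backslashes first, then single quotes, so the comment
            # stays a valid SQL string literal.
            escaped = col.comment.replace("\\", "\\\\").replace("'", "\\'")
            part += f" COMMENT '{escaped}'"
        parts.append(part)
    return ", ".join(parts)


# Same two columns as the StructType in the question.
schema_ddl = to_ddl([
    Column("uuid", "STRING", "Unique customer id"),
    Column("GPS", "STRING"),
])
print(schema_ddl)
```

Under that assumption, the resulting string would be passed as `schema=schema_ddl` in the `@dlt.table(...)` declaration, in place of the StructType.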

4 REPLIES 4

Debayan
Esteemed Contributor III

Hi @Dave Wilson, are you getting any errors with this?

You can include comments.

Delta Live Tables automatically captures the dependencies between datasets defined in your pipeline and uses this dependency information to determine the execution order when performing an update and to record lineage information in the event log for a pipeline.

Both views and tables have the following optional properties:

  • COMMENT: A human-readable description of this dataset.

Please refer to https://docs.databricks.com/workflows/delta-live-tables/delta-live-tables-sql-ref.html#sql-datasets

Trodenn
New Contributor III

What does adding table_properties do again? Any links to the documentation?

Hubert-Dudek
Esteemed Contributor III

table_properties is an optional parameter that you can use to configure various aspects of your Delta Live Tables, such as optimization, partitioning, and retention, and also to set your own custom tags.

You can find more details about table_properties and their possible values in Table properties - https://learn.microsoft.com/en-us/azure/databricks/workflows/delta-live-tables/dlt-table-properties
