It is possible to create an empty streaming table in Delta Live Tables (DLT) with only the schema specified, using the DLT Python API. Here is how you can do it:
- Import the DLT module:
import dlt
- Define the schema:
You can specify the schema using a Python StructType or a SQL DDL string. Here is an example using a Python StructType:
from pyspark.sql.types import StructType, StructField, StringType, LongType
sales_schema = StructType([
    StructField("customer_id", StringType(), True),
    StructField("customer_name", StringType(), True),
    StructField("number_of_line_items", StringType(), True),
    StructField("order_datetime", StringType(), True),
    StructField("order_number", LongType(), True)
])
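The SQL DDL string form mentioned above can be sketched as follows. The variable name sales_schema_ddl is only illustrative, and the column list simply mirrors the StructType example, so treat it as an equivalent under that assumption rather than a required form.

```python
# Same columns as the StructType example, written as a SQL DDL string.
# STRING corresponds to StringType and LONG to LongType.
# All columns are nullable by default, matching nullable=True above.
sales_schema_ddl = """
    customer_id STRING,
    customer_name STRING,
    number_of_line_items STRING,
    order_datetime STRING,
    order_number LONG
"""
```

Either form can be passed to the schema argument, so schema=sales_schema_ddl should work in place of schema=sales_schema.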
- Create the streaming table:
Use the @dlt.table decorator to define the streaming table and specify the schema. Here is an example:
@dlt.table(
    name="sales",
    comment="Raw data on sales",
    schema=sales_schema
)
def sales():
    # Placeholder: replace with the actual streaming source,
    # whose columns must match sales_schema
    return spark.readStream.format("rate").load()
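In this example, the sales function defines a streaming table with the specified schema. The spark.readStream.format("rate").load() call is only a placeholder for the actual streaming source: the rate source emits timestamp and value columns, so replace it with a source whose columns match sales_schema.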
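If the goal is a table that is declared with a schema but has no query attached to it yet, the DLT Python API also provides dlt.create_streaming_table(), which, as far as I know, accepts name, comment and schema arguments. Below is a minimal sketch under that assumption. The wrapper function declare_empty_sales_table and the constant SALES_SCHEMA_DDL are hypothetical names used for illustration, and the import is deferred only so that the snippet can be loaded outside a pipeline, where the dlt module is not available. Please check the exact parameters against the reference linked below.

```python
# Schema as a SQL DDL string (illustrative; mirrors the sales example).
SALES_SCHEMA_DDL = """
    customer_id STRING,
    customer_name STRING,
    number_of_line_items STRING,
    order_datetime STRING,
    order_number LONG
"""


def declare_empty_sales_table():
    """Declare the 'sales' streaming table with a schema and no query."""
    # The dlt module only exists inside a DLT pipeline run,
    # so it is imported here rather than at the top of the file.
    import dlt

    dlt.create_streaming_table(
        name="sales",
        comment="Raw data on sales",
        schema=SALES_SCHEMA_DDL,
    )
```

In a pipeline notebook you would call declare_empty_sales_table() at the top level (or simply call dlt.create_streaming_table(...) directly after import dlt). Data can then be written into the table later by another part of the pipeline.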
Please refer to: https://docs.databricks.com/en/delta-live-tables/python-ref.html