12-02-2024 11:27 PM - edited 12-02-2024 11:28 PM
Hi there @Alberto_Umana,
I tried this approach, but it fails whenever a schema is specified. It is able to create the streaming table without one, but with a schema I get the error below:
com.databricks.pipelines.common.errors.DLTAnalysisException: Table 'bb1123_loans' has a user-specified schema that is incompatible with the schema
inferred from its query.
Streaming tables are stateful and remember data that has already been
processed. If you want to recompute the table from scratch, please full refresh
the table.
Declared schema:
root
|-- loan_number: string (nullable = false)
|-- loan_bal: decimal(38,18) (nullable = true)
|-- cust_number: string (nullable = true)
|-- cust_nm: string (nullable = true)
|-- cust_addr: string (nullable = true)
Inferred schema:
root
|-- timestamp: timestamp (nullable = true)
|-- value: long (nullable = true)
at com.databricks.pipelines.graph.DataflowGraph.$anonfun$validationFailure$22(DataflowGraph.scala:984)
at java.lang.Thread.run(Thread.java:750)
This is my code:
# Case 3: Empty table with a predefined schema
if not schema:
    raise ValueError("Schema must be provided for empty tables without input_path or DataFrame.")

@dlt.table(name=table_name, schema=schema)
def empty_table():
    log_event(logger, f"Creating empty DLT streaming table: {table_name}.", "INFO")
    return (
        spark.readStream.format("rate").load()
    )