Change DLT table type from streaming to 'normal'

SamGreene
Contributor

I have a DLT streaming live table, and after watching a QA session, I saw that it is advised to use streaming tables only for your raw landing layer. I attempted to modify my pipeline so that my silver table is a regular LIVE TABLE, but an error was thrown saying that a streaming live table already exists with the same name. Is there a procedure to handle this?

The error message: 

org.apache.spark.sql.AnalysisException: Could not materialize 'ami.dev.rni_alarms' because a STREAMING_TABLE table already exists with that name.
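
The change boils down to re-declaring the same name with a different table type. A minimal sketch of the two states (the query bodies here are simplified):

-- Original pipeline update: silver declared as a streaming table
CREATE OR REFRESH STREAMING TABLE RNI_Alarms
AS SELECT * FROM STREAM(LIVE.RNI_Alarms_Raw);

-- Modified update: same name re-declared as a live table; this is what
-- fails, since DLT won't re-materialize an existing STREAMING_TABLE as a
-- materialized view in place
CREATE OR REFRESH LIVE TABLE RNI_Alarms
AS SELECT * FROM LIVE.RNI_Alarms_Raw;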

6 REPLIES

Wojciech_BUK
Valued Contributor III

Please check your code, as you probably declared your table "rni_alarms" twice in the same DLT pipeline, possibly across separate notebooks.
Please remember that it is still OK to have streaming tables between, e.g., Bronze and Silver; it all depends on the use case.
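
For example, a silver table can itself be streaming by wrapping the bronze reference in STREAM(); a minimal sketch with illustrative table names:

-- Silver stays streaming: incrementally consumes appends from bronze
CREATE OR REFRESH STREAMING TABLE silver_events
AS SELECT * FROM STREAM(LIVE.bronze_events);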

Thanks for your advice. I have two statements in the notebook regarding these tables and no later references. Here is the code for reference; maybe I have something wrong here...

 
/* Alarm Data */
CREATE OR REFRESH STREAMING TABLE RNI_Alarms_Raw(
  CONSTRAINT timestamp_is_not_null EXPECT (TimeStamp IS NOT NULL),
  CONSTRAINT meterid_is_not_null EXPECT (MeterID IS NOT NULL),
  CONSTRAINT commodity_is_not_null EXPECT (Commodity IS NOT NULL)
)
TBLPROPERTIES ("quality" = "bronze")
AS
SELECT *, _metadata.file_path AS file_path
FROM STREAM read_files(
  '/Volumes/ami/landing/rni/Alarm/',
  format => 'CSV',
  schema => 'RecordType string, RecordVersion string, SenderID string, SenderCustomerID string, ReceiverID string, ReceiverCustomerID string, TimeStamp string, MeterID string, Purpose string, Commodity string, Units string, CalculationConstant int, Interval int, Count int, Data1 string, Data2 string, Data3 string');

/* 2 */
CREATE OR REFRESH LIVE TABLE RNI_Alarms(
  CONSTRAINT timestamp_is_not_null EXPECT (TimeStamp IS NOT NULL),
  CONSTRAINT meterid_is_not_null EXPECT (MeterID IS NOT NULL),
  CONSTRAINT commodity_is_not_null EXPECT (Commodity IS NOT NULL),
  CONSTRAINT alarm_code_is_valid EXPECT (AlarmID < 65) ON VIOLATION DROP ROW
)
TBLPROPERTIES ("quality" = "silver")
AS
SELECT
  RecordType,
  RecordVersion,
  SenderID,
  SenderCustomerID,
  ReceiverID,
  ReceiverCustomerID,
  CAST(TO_TIMESTAMP(TimeStamp, 'yyyyMMddHHmm') AS timestamp) as TimeStamp,
  MeterID,
  Purpose,
  Commodity,
  Units,
  CalculationConstant,
  Interval,
  Count,
  CAST(TO_TIMESTAMP(Data1, 'yyyyMMddHHmm') AS timestamp) as AlarmTimestamp,
  Data2 AS AlarmCodeSet,
  Data3 AS AlarmID
FROM LIVE.RNI_Alarms_Raw
WHERE CAST(TO_TIMESTAMP(Data1, 'yyyyMMddHHmm') AS timestamp) > '2023-12-01 00:00:00';

Regarding streaming being OK in some cases, this scenario is probably one of them, as the data is append-only and I'm not creating aggregates, but I wanted to see how to make the change on something simple.

Wojciech_BUK
Valued Contributor III

Hmm,
can you check your schema to see if a table already exists with this name?

Your error definitely says it can't create the materialized view because a streaming table is already in place.

Maybe you have created another DLT pipeline that points to the same schema?

To validate it, you can change the name of the table in your current pipeline to e.g. rni_alarms_2 (see the sketch at the end of this reply), or you can point the DLT pipeline to another schema like dev_2.

In the DLT world you can't drop a streaming table directly; you have to remove it from the DLT pipeline and run it 🙂
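
A minimal sketch of that rename check, reusing the silver definition from above (rni_alarms_2 is just a throwaway name):

-- If this materializes cleanly, the old rni_alarms object in the target
-- schema is what's blocking the pipeline.
CREATE OR REFRESH LIVE TABLE RNI_Alarms_2
TBLPROPERTIES ("quality" = "silver")
AS SELECT * FROM LIVE.RNI_Alarms_Raw;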

I had the same thought and cleaned up some pipelines from 'v1' I made, which were writing to another set of schemas. No luck.

I looked around the table properties and found something odd in the lineage. When I click the links to the supposed dependencies, it doesn't jump to another table, so I think there is somehow a self-reference. This was the first live table I set up last week, so I probably caused some issues while experimenting with it. I'm giving the new schema in the pipeline a test now.

[Screenshot: table lineage view showing a self-referencing dependency]

SamGreene
Contributor

Pointing to the dev2 schema and then back to dev worked to clear things up.  An unpleasant side effect is that all comments and descriptions are lost from the objects. 
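
If the comments are declared in the pipeline source itself, DLT reapplies them each time it materializes the table, so they should survive this kind of reset; a minimal sketch (the description text is illustrative):

CREATE OR REFRESH LIVE TABLE RNI_Alarms
COMMENT 'Silver alarm records parsed from the RNI landing files'
TBLPROPERTIES ("quality" = "silver")
AS SELECT * FROM LIVE.RNI_Alarms_Raw;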

quakenbush
Contributor

Just curious, could you point me to said QA session if it's a video or something? I'm not aware of such a limitation. You can use DLT's streaming tables anywhere in the medallion architecture; just make sure not to break stream composability by using materialized views, probably in the gold layer if update/merge/delete is required.
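
For instance, a gold aggregate would typically be a (non-streaming) materialized view over the silver table, which is where stream composability intentionally ends; a rough sketch reusing the tables from this thread (the gold table name is illustrative):

-- Gold: recomputed by DLT as needed; cannot be read as a stream downstream
CREATE OR REFRESH LIVE TABLE RNI_Alarm_Daily_Counts
TBLPROPERTIES ("quality" = "gold")
AS SELECT
  MeterID,
  DATE(AlarmTimestamp) AS alarm_date,
  COUNT(*) AS alarm_count
FROM LIVE.RNI_Alarms
GROUP BY MeterID, DATE(AlarmTimestamp);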

I'm asking because I recently completed the Fundamental/Advanced DE certifications and I'm currently working on the Professional certification. Either I was asleep (those videos aren't my favorite medium) or I simply never heard of this recommendation in training...
