Data Engineering
Delta Live Tables with CDC and Database Views with Lower Case Names

zesdatascience
New Contributor III

Hi,

I am testing out creating some Delta Live Tables using Change Data Capture and have an issue where the resulting views are created with lower case column names. Here is the function I am using to ingest data:

import datetime

import dlt
from pyspark.sql import functions as F


def raw_to_ods_merge(table_name, source_stream, file_description,
                     ods_path, primary_keys, sequence, table_prop, column_comments):
    # Name of the intermediate view that feeds apply_changes
    ingest_name = table_name + '_ingest'

    # Add ETL audit columns to the incoming stream
    output = source_stream \
        .withColumn('ETL_InputFile', F.input_file_name()) \
        .withColumn('ETL_LoadDate', F.lit(datetime.datetime.now()))

    @dlt.view(name=ingest_name)
    def source_ingest():
        return output

    # Target table for the CDC merge, using the schema of the ingest stream
    dlt.create_target_table(name=table_name,
                            comment=file_description,
                            path=ods_path,
                            table_properties=table_prop,
                            schema=output.schema)

    # Merge changes from the ingest view into the target table
    dlt.apply_changes(
        target=table_name,
        source=ingest_name,
        keys=primary_keys,
        sequence_by=sequence
    )

This results in an __apply_change... table with column names in mixed case, as expected. However, the view that gets created sets the output column names to lower case, e.g.:

CREATE VIEW `raw`.`shopper_panel_korea` (
 `duration` COMMENT 'Length of time for which the metrics apply.',
 `period` COMMENT 'Date for which the metrics apply, the end of the date range.',
 `product`,
 `spend_1000000_krw`,
 `product_percent_of_category_value`,
 `retailer_percent_of_value`,
 `volume_1000_kg`,
 `product_percent_of_category_volume_1kg`,
 `retailer_percent_of_volume_1kg`,
 `buyers_1000`,
 `penetration_percent`,
 `frequency`,
 `trips_1000`,
 `performance`,
 `channel`,
 `domestic_or_import`,
 `etl_inputfile`,
 `etl_loaddate`)
TBLPROPERTIES (
 'transient_lastDdlTime' = '1651620122')
AS SELECT `Duration`,`Period`,`Product`,`Spend_1000000_KRW`,`Product_Percent_of_Category_Value`,`Retailer_Percent_of_Value`,`Volume_1000_kg`,`Product_Percent_of_Category_Volume_1kg`,`Retailer_Percent_of_Volume_1kg`,`Buyers_1000`,`Penetration_Percent`,`Frequency`,`Trips_1000`,`Performance`,`Channel`,`Domestic_or_Import`,`ETL_InputFile`,`ETL_LoadDate`
FROM `raw`.`__apply_changes_storage_shopper_panel_korea`
WHERE __DeleteVersion IS NULL

Is there something I am doing wrong or is this an issue?

Thanks in advance for your help,

Stu
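
For readers hitting the same behaviour, one possible stopgap is to respell the lower-cased view columns using the mixed-case names from the source schema. The sketch below is only that: it assumes the view and source names differ solely by case (as the view definition in the post suggests), it does not address why the view is created with lower-case names, and the helper names `build_case_map` and `restore_case` are hypothetical, not part of Delta Live Tables or Spark.

```python
def build_case_map(original_columns):
    """Map each lower-cased column name to its original mixed-case spelling."""
    case_map = {}
    for name in original_columns:
        key = name.lower()
        # Two source columns that differ only by case cannot be told apart
        if key in case_map and case_map[key] != name:
            raise ValueError(
                f"Columns {case_map[key]!r} and {name!r} differ only by case"
            )
        case_map[key] = name
    return case_map


def restore_case(view_columns, original_columns):
    """Respell view_columns with the original casing; unknown names pass through."""
    case_map = build_case_map(original_columns)
    return [case_map.get(col.lower(), col) for col in view_columns]


# A few of the column names from the post, abbreviated for illustration
original = ["Duration", "Period", "Spend_1000000_KRW", "ETL_InputFile", "ETL_LoadDate"]
lowered = ["duration", "period", "spend_1000000_krw", "etl_inputfile", "etl_loaddate"]

restored = restore_case(lowered, original)
print(restored)
```

With a Spark DataFrame read from the view, the restored names could then be applied with something like `df.toDF(*restore_case(df.columns, original))`. Since Spark resolves column names case-insensitively by default, queries against the view should still work either way; the respelling only matters where the displayed or exported names need to keep their original casing.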

1 ACCEPTED SOLUTION

zesdatascience
New Contributor III

Hi @Kaniz Fatma​ 

Not found a solution just yet, but not a priority as most users will be accessing through Databricks SQL, so no further assistance required right now.

Thanks

7 REPLIES

Kaniz_Fatma
Community Manager

Hi @Stuart Fish, we recommend following the Getting Started with Delta Live Tables guide, which explains how to create scalable and reliable pipelines using Delta Live Tables (DLT) and its declarative ETL definitions.

Kaniz_Fatma
Community Manager

Hi @Stuart Fish, just a friendly follow-up. Do you still need help, or does the above response help you to find the solution? Please let us know.

zesdatascience
New Contributor III

Hi @Kaniz Fatma​ 

Thanks for your email. Yes, I did look through the article shared. I am set up with Delta Live Tables and pipelines are working OK generally.

It was really just the one issue mentioned in the post: the lower case column names in the views when viewed through the Hive catalog. I was wondering whether this is a defect in the view creation, which is handled automatically by Delta Live Tables. If so, I am not sure where I would log it.

One thing I did notice is that it looks fine through Databricks SQL. It may be something to do with the runtime version I was using, so I'll check that I am running the latest and see if the issue is still there.

Thanks for your help,

Stu

Hi @Stuart Fish​ , Thank you for the update. Do reach out in case of any other issues.

Kaniz_Fatma
Community Manager

Hi @Stuart Fish, just a friendly follow-up. Do you still need help, or does the above response help you to find the solution? Please let us know.

zesdatascience
New Contributor III

Hi @Kaniz Fatma​ 

Not found a solution just yet, but not a priority as most users will be accessing through Databricks SQL, so no further assistance required right now.

Thanks

Kaniz_Fatma
Community Manager

Hi @Stuart Fish, I was checking back to see if you have a resolution yet. If you have any solution, please share it with the community as it can be helpful to others. Otherwise, we will respond with more details and try to help.
