cancel
Showing results for 
Search instead for 
Did you mean: 
Data Engineering
cancel
Showing results for 
Search instead for 
Did you mean: 

On the fly transformations on DLT tables

Mathias_Peters
New Contributor II

Hi, 

I am loading data from a kinesis data stream using DLT. 

CREATE STREAMING TABLE Consumers_kinesis_2
(
...,
unbase64(data) String,
...
)
AS
SELECT * FROM STREAM read_kinesis (...)
 
Is it possible to directly cast, unbase64, and/or transform the resulting attributes before writing them into the streaming table? I'd like to interpret the data aatribute of the kinesis stream which is binary but contains JSON.
 
best and thanks in advance,
Mathias 
 
 
1 REPLY 1

Kaniz
Community Manager
Community Manager

Hi @Mathias_PetersWhen working with Amazon Kinesis Data Analytics, you can indeed transform data before writing it into a streaming table.

Let’s explore some options:

  1. Unbase64 Transformation:

    • To decode Base64-encoded data, you can use the unbase64 function. This function converts a Base64-encoded string into its original binary representation.
    • For example, in your SQL query:
      CREATE STREAMING TABLE Consumers_kinesis_2
      (
          ...,
          unbase64(data) String,
          ...
      )
      AS
      SELECT * FROM STREAM read_kinesis (...)
      
      The unbase64(data) expression will decode the Base64 data from the data attribute in your Kinesis stream.
  2. Additional Transformations:

    • If your decoded data is in JSON format, you can further manipulate it using other SQL functions. For instance:
      • Extract specific fields from the JSON using JSON_VALUE or JSON_PARSE.
      • Cast the extracted values to appropriate data types (e.g., integers, strings, etc.).
      • Perform any necessary data cleansing or enrichment.
    • Here’s an example of extracting a specific field from the JSON:
      SELECT
          JSON_VALUE(unbase64(data), '$.myField') AS myField
      FROM
          Consumers_kinesis_2;
      
      Replace '$.myField' with the actual path to the desired field within your JSON data.
  3. Preprocessing with AWS Lambda:

Remember to adjust the examples above based on your specific use case and the structure of your data. Happy streaming! 😊

If you have any further questions or need additional assistance, feel free to ask! 🚀

 
Welcome to Databricks Community: Lets learn, network and celebrate together

Join our fast-growing data practitioner and expert community of 80K+ members, ready to discover, help and collaborate together while making meaningful connections. 

Click here to register and join today! 

Engage in exciting technical discussions, join a group with your peers and meet our Featured Members.