โ05-10-2024 09:25 AM
Hi,
I'm trying to set a dynamic value to use in a DLT query, and the code from the example documentation does not work.
SET startDate='2020-01-01';
CREATE OR REFRESH LIVE TABLE filtered
AS SELECT * FROM my_table
WHERE created_at > ${startDate};
It is getting an error because it plugs in an empty value for the variable in the query.
[PARSE_SYNTAX_ERROR] Syntax error at or near end of input. SQLSTATE: 42601 (line 3, pos 19)
== SQL ==
CREATE OR REFRESH LIVE TABLE filtered
AS SELECT * FROM my_table
WHERE created_at >
-------------------^^^
It seems like the documentation is wrong. Is there a different way to do this?
Thanks!
โ05-16-2024 12:34 PM
Hi @MarkD how are you?
The `SET` command alone is typically used to change the settings of a session, such as configuration parameters.
To define a variable inside a SQL statement, you can use SET VAR <variablename> = <variablevalue> or SET VARIABLE <variablename> = <variablevalue> to define, as on the example below:
-- Set a SQL variable to a value using SET VAR > SET VAR myvar1 = 5; > VALUES (myvar1); 5 -- Set a SQL variable to 7 using the SET VARIABLE command > SET VARIABLE myvar1 = 7; > VALUES (myvar1); 7
Please relate to the SET VAR documentation for further reference: https://docs.databricks.com/ja/sql/language-manual/sql-ref-syntax-aux-set-variable.html
Take care!
โ05-16-2024 12:42 PM
The example I gave is directly from Databrick's DLT documentation, but as you say does not appear the be the right way to set a variable and doesn't work.
Unfortunately SET VAR is not supported in Delta Live Tables.
โ05-16-2024 01:39 PM
For DLT pipelines, please use the "Advanced" panel, on the pipeline menu, to set your variables:
That will set up a new configuration value that can be referenced inside your code as a variable:
CREATE OR REFRESH LIVE TABLE filtered
AS SELECT * FROM my_table
WHERE created_at > ${startDate};
Would this solve your issue? Also, can you please share the documentation from DLT that states that the SET command can be used to create variables?
Alessandro
โ05-16-2024 02:33 PM
I've used the advanced configuration option for other variables, but in this case I wanted something dynamic. I want to `SET startDate = current_time()` and then use `startDate` through all my queries to make sure they are all using the same time comparison in their queries.
Here's the documentation that shows using "SET" as I did in the initial question. https://docs.databricks.com/en/delta-live-tables/sql-ref.html#set-configuration-values-for-a-table-o...
โ05-16-2024 05:12 PM
@MarkD I beleve that you don't need use a var like that.
What do you think about use only a function current_time(), like that:
CREATE OR REFRESH LIVE TABLE filtered
AS SELECT * FROM my_table
WHERE created_at > current_time();
โ05-16-2024 06:18 PM
The reason I want to use a variable is so that I can use the same cut off time in multiple queries to make sure to make sure that I don't get records in one query with a later date than another.
For example
CREATE OR REFRESH LIVE TABLE filtered_one
AS SELECT * FROM my_table_one
WHERE created_at > current_time();
CREATE OR REFRESH LIVE TABLE filtered_two
AS SELECT * FROM my_table_two
WHERE created_at > current_time();
If I want to compare records in `filtered_one` to the results in `filtered_two` I can't rely on them having used the same value for `current_time()`.
โ05-17-2024 10:16 AM
Have you considered use checkpoint?
If you do this maybe empty dataset comes becouse there are no date in future... created_at > current_time().
Read more about [Change Data Feed] Enable it in your base table and then capture only new changes feed.
โ05-17-2024 06:29 AM
Hi @MarkD ,
You may use
set variable_name.var= '1900-01-01'
to set the value of variable and in order to use the value of variable
use
${automated_date.var}
Example:
set automated_date.var= '1800-01-01'
select * from my table where date = CAST(${automated_date.var} as DATE)
6 hours ago
@anardinelli Can you please help with a solution? I am am having issue with setting a variable in delta live table pipeline and use it with APPLY CHANGES INTO syntax.
Join a Regional User Group to connect with local Databricks users. Events will be happening in your city, and you wonโt want to miss the chance to attend and share knowledge.
If there isnโt a group near you, start one and help create a community that brings people together.
Request a New Group