06-18-2024 06:50 AM - edited 06-19-2024 01:26 AM
I am trying to create a simple DLT pipeline:
Accepted Solutions
06-19-2024 02:36 AM - edited 06-19-2024 02:37 AM
I only just noticed you are using DLT. My bad.
The @dlt.table decorator tells DLT to create a table containing the DataFrame that the decorated function returns.
Basically, you can't operate on the function's result the way you would on an ordinary DataFrame. Instead, read the DLT table it created with dlt.read(<table_name>) and run your DataFrame operations on that, for example dlt.read(<table_name>).count().
Example:
import dlt

@dlt.table
def test():
    # Read the upstream DLT table by name, then use normal DataFrame operations on it.
    df = dlt.read("today_latest_execution")
    if df.count() >= 0:
        return df
DLT works quite differently from what you're used to when working with function return values.
Hope this helps!
Edit: argh, the forum kept auto-tagging a user named Dlt wherever the decorator appears. It should read @dlt.table, in lowercase.
06-18-2024 07:10 AM
Can you try count() instead of count (without brackets)?
P.S. A DataFrame is a Dataset of type Row.
06-18-2024 11:01 AM
You're missing the parentheses: count()
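A quick sketch of why the parentheses matter, in plain Python so it runs without Spark. FakeFrame is a hypothetical stand-in for a DataFrame, used only for illustration; the behaviour shown is ordinary Python method semantics.

```python
# FakeFrame is a hypothetical stand-in for a DataFrame so this runs without Spark.
class FakeFrame:
    def __init__(self, rows):
        self._rows = list(rows)

    def count(self):
        """Return the number of rows."""
        return len(self._rows)


df = FakeFrame([("a",), ("b",)])

# Without parentheses you get the bound method object, not a number.
method = df.count

# Comparing that method object with an int raises TypeError in Python 3.
try:
    df.count > 0
    comparison_failed = False
except TypeError:
    comparison_failed = True

# With parentheses the method is actually called and returns an int.
n_rows = df.count()
```

So a condition written against count rather than count() never sees a row count at all.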
06-19-2024 01:27 AM - edited 06-19-2024 01:28 AM
@jacovangelder @-werners- Yes, it does have () there. Sorry, I copied the code incorrectly.
The error is still the same, though 😞
06-19-2024 02:42 AM
Glad I work in Scala and do not have to deal with DLT 😄
06-19-2024 02:44 AM
Not a fan myself either! It seems DLT is getting a big rebrand with LakeFlow around the corner. In my experience DLT was never that widely adopted.
06-19-2024 02:14 AM
What if you do:
return spark.sql("SELECT * FROM LIVE.last_execution").toDF()

