DLT streaming table shows more "Written records" than it's actually writing to the table
09-02-2024 06:37 AM
Hi!
I have a DLT setup that streams data from incoming Parquet files into bronze, silver, and gold tables. There is a strange bug: in the pipeline graph UI, the number of written records shown for the gold streaming table is far greater than the number of rows actually written to it. For testing, I'm downsampling the time-series data to 1-minute intervals when streaming to the gold table.
Also, I wonder if there's anything wrong with how I give Auto Loader the path to the data:
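For illustration, here is a minimal plain-Python sketch of what 1-minute downsampling does to row counts. It is not the pipeline's actual code, which isn't shown in this post; the function name, the sample values, and the choice of averaging as the aggregation are all assumptions.

```python
from collections import defaultdict
from datetime import datetime


def downsample_to_minutes(records):
    """Average (timestamp, value) pairs into 1-minute buckets (hypothetical helper)."""
    buckets = defaultdict(list)
    for ts, value in records:
        # Truncate each timestamp to the start of its minute.
        buckets[ts.replace(second=0, microsecond=0)].append(value)
    return {minute: sum(vals) / len(vals) for minute, vals in sorted(buckets.items())}


# Made-up sample data: three readings spread over two distinct minutes.
sample = [
    (datetime(2024, 8, 4, 23, 59, 1), 1.0),
    (datetime(2024, 8, 4, 23, 59, 31), 3.0),
    (datetime(2024, 8, 5, 0, 0, 5), 5.0),
]
result = downsample_to_minutes(sample)
print(len(sample), "input rows,", len(result), "output rows")
```

The sketch only shows that a downsampling step emits fewer rows than it reads; it makes no claim about how the DLT UI computes its "Written records" figure.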
data_path = "/Volumes/fc_lake_catalog/test_external_schema/test_external_volume/2024/*/*"
The Parquet files are written to ADLS Gen2 via Event Hubs in this kind of structure: 2024/08/04/23/59/*.parquet.
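To make the depth of that layout concrete, here is a small plain-Python sketch that compares a glob pattern with a path one segment at a time. It assumes strict, shell-style matching in which each `*` covers exactly one path segment; Auto Loader's own globbing may behave differently and is not modeled here. The helper name and the file name `part-0.parquet` are hypothetical.

```python
from fnmatch import fnmatchcase


def matches_segmentwise(pattern, path):
    """Return True if every path segment matches the pattern segment at the same depth."""
    pattern_parts = pattern.strip("/").split("/")
    path_parts = path.strip("/").split("/")
    # Under this strict interpretation the depths must be equal.
    if len(pattern_parts) != len(path_parts):
        return False
    return all(fnmatchcase(seg, pat) for pat, seg in zip(pattern_parts, path_parts))


# "2024/*/*" reaches the month and day directories only.
print(matches_segmentwise("2024/*/*", "2024/08/04"))
# The files sit three levels deeper: hour, minute, then the file itself.
print(matches_segmentwise("2024/*/*", "2024/08/04/23/59/part-0.parquet"))
print(matches_segmentwise("2024/*/*/*/*/*.parquet", "2024/08/04/23/59/part-0.parquet"))
```

Under this strict reading, the `2024/*/*` pattern stops at the day directories, so whether the files below them are picked up depends on whether the loader recurses into matched directories.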
Labels:
- Delta Lake
- Spark
1 REPLY
09-02-2024 06:51 AM
Also, after running this for a while, I get these errors:

