- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
ā03-01-2023 01:07 AM
Hi!
We have this dbt model that generates a table with user activity in the previous days, but we get this vague error message in the Databricks SQL Warehouse.
Job aborted due to stage failure: Task 3 in stage 4267.0 failed 4 times, most recent failure: Lost task 3.3 in stage 4267.0 (TID 41247) (<some ip_address> executor 18): ExecutorLostFailure (executor 18 exited caused by one of the running tasks) Reason: Command exited with code 50
Driver stacktrace:
The stacktrace is empty and I can't find anyone else having this exact problem on the internet.
This model was migrated from Redshift and the only thing we changed were the dateadd format. So it should be valid SQL.
< on p.activity_date between dateadd(day, -1, d.end_date)
---
> on p.activity_date between date_add('day', -1, d.end_date)
The model rarely works as part of the daily run, but often seems to work if you try to rerun it. I'm wondering if this implies that it is some sort of internal Databricks errors caused by stack overflow or memory issues.
I have a query profile and it seems it is failing a the big "Columnar To Row, Nested Loop Join, Hash Aggregate, Hash Aggregate" "(Whole Stage Codegen fuses multiple operators together to optimize query performance. Metrics represent all operators combined)"
- Labels:
-
Bug Report
-
Dateadd
-
Error Code
-
Error Message
-
Join
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
ā03-02-2023 03:00 AM
We haven't been able to figure out the exact cause, but we found solution around it. If you precalculate the datediff of the joins you don't get this error and the query runs significantly faster.
inner join dates d
on p.activity_date between dateadd(day, -7, d.end_date)
AND d.end_date
inner join dates d
on p.activity_date between d.end_date_m_7
AND d.end_date
I'm suspecting it has something to do with distributing data and that it does it in a smarter way when it already has the result of dateadd(day, -7, d.end_date) . Maybe it doesn't realise that it will be the same for each day.
We're running a medium SQL Pro warehouse with cost optimised spot policy. I don't see the version, but I guess it is the current one for 2/3 2023
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
ā03-02-2023 03:00 AM
We haven't been able to figure out the exact cause, but we found solution around it. If you precalculate the datediff of the joins you don't get this error and the query runs significantly faster.
inner join dates d
on p.activity_date between dateadd(day, -7, d.end_date)
AND d.end_date
inner join dates d
on p.activity_date between d.end_date_m_7
AND d.end_date
I'm suspecting it has something to do with distributing data and that it does it in a smarter way when it already has the result of dateadd(day, -7, d.end_date) . Maybe it doesn't realise that it will be the same for each day.
We're running a medium SQL Pro warehouse with cost optimised spot policy. I don't see the version, but I guess it is the current one for 2/3 2023
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
ā03-21-2023 11:15 PM
Hi @Mattias Pā
Thank you for your question! To assist you better, please take a moment to review the answer and let me know if it best fits your needs.
Please help us select the best solution by clicking on "Select As Best" if it does.
Your feedback will help us ensure that we are providing the best possible service to you.
Thank you!
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
ā03-27-2023 08:50 AM
But there is no solution, only my own work around?
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
ā03-29-2023 10:11 AM
@Mattias Pā - For the executor lost failure, is it trying to bring in large data volume? can you please reduce the date range and try? or run the workload on a bigger DBSQL warehouse than the current one.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
ā03-30-2023 08:54 AM
Reducing the data volume works, sadly we need that exact logic and we have that users. However, calculating the join condition works as I mentioned in my own reply in this thread