Machine Learning
Dive into the world of machine learning on the Databricks platform. Explore discussions on algorithms, model training, deployment, and more. Connect with ML enthusiasts and experts.

What does "Command exited with code 50" mean, and how do you solve it?

fuselessmatt
Contributor

Hi!

We have this dbt model that generates a table with user activity in the previous days, but we get this vague error message in the Databricks SQL Warehouse.

Job aborted due to stage failure: Task 3 in stage 4267.0 failed 4 times, most recent failure: Lost task 3.3 in stage 4267.0 (TID 41247) (<some ip_address> executor 18): ExecutorLostFailure (executor 18 exited caused by one of the running tasks) Reason: Command exited with code 50

Driver stacktrace:

The stacktrace is empty and I can't find anyone else having this exact problem on the internet.

This model was migrated from Redshift, and the only thing we changed was the dateadd format, so it should be valid SQL.

< on p.activity_date between dateadd(day, -1, d.end_date)

---

> on p.activity_date between date_add('day', -1, d.end_date)

The model rarely works as part of the daily run, but it often seems to work if you rerun it. I'm wondering if this implies that it is some sort of internal Databricks error caused by a stack overflow or memory issues.

I have a query profile, and it seems to be failing at the big "Columnar To Row, Nested Loop Join, Hash Aggregate, Hash Aggregate" node, which is described as "(Whole Stage Codegen fuses multiple operators together to optimize query performance. Metrics represent all operators combined)".

Accepted Solutions

fuselessmatt
Contributor

We haven't been able to figure out the exact cause, but we found a workaround. If you precalculate the dateadd result used in the join condition, you don't get this error and the query runs significantly faster.

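    -- before: offset computed inside the join condition
    inner join dates d
        on p.activity_date between dateadd(day, -7, d.end_date)
        AND d.end_date

    -- after: offset precalculated as a column on dates
    inner join dates d
        on p.activity_date between d.end_date_m_7
        AND d.end_date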

I suspect it has something to do with how the data is distributed, and that the engine does it in a smarter way when it already has the result of dateadd(day, -7, d.end_date). Maybe it doesn't realise that the value will be the same for each day.
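For anyone wanting to try the same workaround, here is a minimal sketch of how the precalculated column could be produced and then used. It assumes `dates` is built as its own table (for example a dbt model materialized as a table) so the offset is stored rather than recomputed. The names `calendar_dates`, `user_activity`, `user_id`, and `active_users` are hypothetical placeholders; only `activity_date`, `end_date`, and `end_date_m_7` come from the snippets above.

```sql
-- Sketch of the dates model: store the 7-day offset once per end_date.
-- `calendar_dates` is a hypothetical source of end dates.
create or replace table dates as
select
    end_date,
    dateadd(day, -7, end_date) as end_date_m_7
from calendar_dates;

-- Sketch of the downstream model: the join now compares plain columns only.
-- `user_activity`, `user_id`, and `active_users` are hypothetical names.
select
    d.end_date,
    count(distinct p.user_id) as active_users
from user_activity p
inner join dates d
    on p.activity_date between d.end_date_m_7
    and d.end_date
group by d.end_date;
```

Whether this avoids the executor failure in other setups is not something this thread establishes; it only mirrors the change that worked here.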

We're running a medium SQL Pro warehouse with cost optimised spot policy. I don't see the version, but I guess it is the current one for 2/3 2023

Anonymous
Not applicable

Hi @Mattias P

Thank you for your question! To assist you better, please take a moment to review the answer and let me know if it best fits your needs.

Please help us select the best solution by clicking on "Select As Best" if it does.

Your feedback will help us ensure that we are providing the best possible service to you.

Thank you!

But there is no solution, only my own workaround?

shan_chandra
Databricks Employee

@Mattias P - For the executor lost failure, is it trying to bring in a large data volume? Can you please reduce the date range and try again, or run the workload on a bigger DBSQL warehouse than the current one?

Reducing the data volume works, but sadly we need that exact logic and we do have that many users. However, precalculating the join condition works, as I mentioned in my own reply in this thread.
