What does "Command exited with code 50" mean and how do you solve it?

fuselessmatt
Contributor

Hi!

We have a dbt model that generates a table of user activity over the previous days, but it fails with this vague error message on a Databricks SQL Warehouse:

Job aborted due to stage failure: Task 3 in stage 4267.0 failed 4 times, most recent failure: Lost task 3.3 in stage 4267.0 (TID 41247) (<some ip_address> executor 18): ExecutorLostFailure (executor 18 exited caused by one of the running tasks) Reason: Command exited with code 50

Driver stacktrace:

The stacktrace is empty and I can't find anyone else having this exact problem on the internet.

This model was migrated from Redshift, and the only thing we changed was the dateadd format, so it should be valid SQL.

< on p.activity_date between dateadd(day, -1, d.end_date)

---

> on p.activity_date between date_add('day', -1, d.end_date)

The model rarely succeeds as part of the daily run, but it often seems to work if you rerun it. I'm wondering whether this implies some sort of internal Databricks error caused by a stack overflow or memory issue.

I have a query profile, and it seems to be failing at the big "Columnar To Row, Nested Loop Join, Hash Aggregate, Hash Aggregate" step "(Whole Stage Codegen fuses multiple operators together to optimize query performance. Metrics represent all operators combined)".