I'm trying to perform a MERGE between two tables (customers and customers_update) using Spark SQL, but I’m encountering an internal error during the planning phase. The error message suggests it might be a bug in Spark or one of the plugins in use.
Here’s the SQL code I’m running:
```
MERGE INTO customers AS c
USING customers_update AS u
ON c.customer_id = u.customer_id
WHEN MATCHED AND c.email IS NULL AND u.email IS NOT NULL THEN
UPDATE SET email = u.email
WHEN NOT MATCHED THEN
INSERT (customer_id, email, profile, updated)
VALUES (u.customer_id, u.email, u.profile, u.updated);
```
And the error message:
```
[INTERNAL_ERROR] The Spark SQL phase planning failed with an internal error. You hit a bug in Spark or the Spark plugins you use. Please, report this bug to the corresponding communities or vendors, and provide the full stack trace. SQLSTATE: XX000
```
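I haven't been able to tell whether the failure is related to how the unqualified table name gets resolved, but for reference, this is how I'm checking which catalog and database my session resolves unqualified names against (assuming the current_catalog() and current_database() functions are available in my Spark version):
```
-- Show the catalog and database the session currently uses to resolve
-- unqualified table names such as `customers`.
SELECT current_catalog(), current_database();
```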
As a workaround, I modified the statement to fully qualify the target table with its catalog (metastore) and database, which resolved the issue:
```
MERGE INTO hive_metastore.default.customers AS c
USING customers_update AS u
ON c.customer_id = u.customer_id
WHEN MATCHED AND c.email IS NULL AND u.email IS NOT NULL THEN
UPDATE SET email = u.email
WHEN NOT MATCHED THEN
INSERT (customer_id, email, profile, updated)
VALUES (u.customer_id, u.email, u.profile, u.updated);
```
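My assumption is that the error has something to do with how the unqualified name `customers` is resolved, so I'm wondering whether pointing the session at the same catalog and schema up front would work as well. A sketch of what I mean, untested, and assuming the Databricks-style USE CATALOG / USE SCHEMA statements are available in my environment:
```
-- Untested sketch: make hive_metastore.default the session default,
-- so that the original, unqualified MERGE INTO customers resolves to
-- the same table as the fully qualified workaround above.
USE CATALOG hive_metastore;
USE SCHEMA `default`;
-- ...then run the original MERGE statement unchanged.
```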
Could someone explain why fully qualifying the table name is necessary here, and whether there is a configuration that would prevent these errors in the future? Also, is there a known fix for this bug? The suggestions I got from an AI assistant didn't resolve it.
Rafael Sousa