Data Engineering

Delta Table Merge statement is not accepting broadcast hint

Chalki
New Contributor III

I have a statement like this with pyspark:

target_tbl.alias("target")\
    .merge(stage_df.hint("broadcast").alias("source"), merge_join_expr)\
    .whenMatchedUpdateAll()\
    .whenNotMatchedInsertAll()\
    .whenNotMatchedBySourceDelete()\
    .execute()

It is not accepting the broadcast hint. I am getting the following:

Join hint ignored: This query has a join hint '(strategy=broadcast)' that is not associated with any join operator and will thus be ignored. Investigate the query to see if the hint is placed correctly.

This video says that this approach works:

https://www.youtube.com/watch?v=o2k9PICWdx0&t=797s

1 ACCEPTED SOLUTION

Anonymous
Not applicable

@Nikolay Chalkanov:

The error message indicates that the join hint is not associated with any join operator. This can happen if the hint is not placed correctly, or if there is no join operator in the query that can take advantage of the hint.

In the code you posted, it seems that you are using the merge method to perform the join operation. According to the Delta Lake documentation, the merge method does not support broadcast hints:

https://docs.delta.io/latest/delta-update.html#joins

Therefore, it is expected that the hint is being ignored.

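The video you mentioned might be using a different method to perform the join operation, one that supports broadcast hints. To check whether the broadcast hint is accepted on a plain join, you could try the join method instead of merge. Note that this only tests the hint: an inner join written in overwrite mode does not update, insert, or delete rows the way your merge does. Also, target_tbl is a DeltaTable, so it has to be converted with toDF() before join can be called, and if both sides share column names you may need to select the columns you want before writing:

target_tbl.toDF().alias("target")\
    .join(stage_df.hint("broadcast").alias("source"), join_expr, "inner")\
    .write.format("delta").mode("overwrite").save("output_delta")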
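
To make the difference between the two operations concrete, here is a small conceptual sketch in plain Python. It makes no Spark or Delta calls. The helper names merge_like and inner_join, the "id" key, and the sample rows are hypothetical and exist only for illustration. It assumes each key appears at most once per side, and it models the three clauses from the question (update all on match, insert all when not matched, delete when not matched by source).

```python
# Conceptual model only: tables are dicts mapping a key to a row.
# Nothing here calls Spark or Delta; the names are illustrative.

def merge_like(target, source):
    """Apply update-all / insert-all / delete-not-in-source to a copy of target."""
    result = dict(target)
    for key, row in source.items():
        # Matched key: update all columns. Unmatched key: insert the row.
        result[key] = dict(row)
    for key in target:
        if key not in source:
            # Target row with no source counterpart: delete it.
            del result[key]
    return result


def inner_join(target, source):
    """Keep only keys present on both sides, pairing the two rows."""
    return {key: (target[key], source[key]) for key in target if key in source}


target = {1: {"id": 1, "v": "old"}, 2: {"id": 2, "v": "stale"}}
source = {1: {"id": 1, "v": "new"}, 3: {"id": 3, "v": "added"}}

merged = merge_like(target, source)   # keys 1 and 3, both carrying source values
joined = inner_join(target, source)   # key 1 only, with both row versions paired
```

In this model the merge result contains the updated and inserted rows and drops the unmatched target row, while the inner join keeps only the matched key. That is why the join snippet is useful for checking the hint but is not a drop-in replacement for the merge.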

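Keep in mind that a broadcast hint asks Spark to copy the hinted DataFrame to every executor, so it helps only when that DataFrame is small enough to fit in memory. On a large source it can slow the job down or cause out-of-memory errors, so it might not always be the best option depending on the size of your data and the resources available.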
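
As a rough illustration of why the size of the broadcast side matters, the following plain-Python sketch mimics the idea behind a broadcast hash join. It is not Spark's implementation. The function name broadcast_hash_join, the list-of-partitions layout, and the sample rows are assumptions made for this example.

```python
# Conceptual sketch of a broadcast hash join; not Spark's actual code.

def broadcast_hash_join(large_partitions, small_rows, key):
    # Build one in-memory lookup table from the small side. On a cluster,
    # a full copy of this table would be held by every executor, which is
    # why the small side has to fit in memory.
    lookup = {}
    for row in small_rows:
        lookup.setdefault(row[key], []).append(row)

    joined = []
    # Each partition of the large side is scanned where it already lives;
    # the large side is never reshuffled by key.
    for partition in large_partitions:
        for row in partition:
            for match in lookup.get(row[key], []):
                joined.append((row, match))
    return joined


large = [
    [{"id": 1, "x": "a"}, {"id": 2, "x": "b"}],
    [{"id": 3, "x": "c"}, {"id": 1, "x": "d"}],
]
small = [{"id": 1, "y": "p"}, {"id": 3, "y": "q"}]

pairs = broadcast_hash_join(large, small, "id")
```

The trade-off is visible in the structure: the cost of holding lookup is paid once per executor, so a small source makes the join cheap, while a large one multiplies memory use across the cluster.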


2 REPLIES

Anonymous
Not applicable

Hi @Nikolay Chalkanov,

Thank you for posting your question in our community! We are happy to assist you.

To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers your question?

This will also help other community members who may have similar questions in the future. Thank you for your participation and let us know if you need any further assistance! 
