cancel
Showing results for 
Search instead for 
Did you mean: 
Community Platform Discussions
Connect with fellow community members to discuss general topics related to the Databricks platform, industry trends, and best practices. Share experiences, ask questions, and foster collaboration within the community.
cancel
Showing results for 
Search instead for 
Did you mean: 

Identity column has null values

rachit-prodigal
New Contributor

I created a table in databricks using a dbt model pre hook 

 

CREATE TABLE IF NOT EXISTS accounts (
account_id BIGINT GENERATED ALWAYS AS IDENTITY,
description STRING
other columns
)

I use the same dbt model to merge values into this table in the post-hook

 

post_hook = [
"
MERGE INTO accounts AS target
USING incremental_accounts AS source
ON <join-condition>
WHEN MATCHED THEN UPDATE SET
target.description = source.description
WHEN NOT MATCHED BY target THEN
INSERT(description)
VALUES (source.description)
"

the model code itself is simple in the file incremental_accounts.sql

SELECT description from new_accounts

The table accounts was created by this model and only updated by this model by the post-hook queries.

The problem is that I noticed null values in account_id which should not be the case as it is auto-generated. 

I identified the same issue with 4 other tables which were built similarly using dbt models.

 

Can you please help me with this issue. 

 

1 REPLY 1

Kaniz_Fatma
Community Manager
Community Manager

Hi @rachit-prodigal

  • Make sure that the account_id column is correctly defined as an identity column in your dbt model. Verify that the GENERATED ALWAYS AS IDENTITY clause is present in the model definition.
  • Confirm that the model is using the correct data type for the account_id column (e.g., BIGINT).
  • Review the post-hook query carefully. Ensure that the MERGE INTO statement is correctly joining the accounts and incremental_accounts tables based on the specified <join-condition>.
  • Double-check the logic for updating and inserting records. The WHEN MATCHED THEN UPDATE clause should correctly update the description field, and the WHEN NOT MATCHED BY target THEN INSERT clause should insert new records.
  • Identity columns are automatically populated by the database system. If you’re seeing null values, it’s possible that there’s an issue with how the identity column is being handled.
  • Check if there are any triggers or other processes that might interfere with the identity column generation during the merge operation.
  • Temporarily modify the post-hook to include additional logging or print statements. This will help you understand what’s happening during the merge process.
  • For example, you can add a SELECT statement after the MERGE INTO to see the results of the merge operation.
  • Create a simplified dbt model with just the account_id and description columns. Use a basic INSERT statement in the post-hook to see if the identity column behaves as expected.
  • This will help isolate the issue and determine if it’s related to the specific merge logic or the identity column itself.
  • If you encounter any specific error messages or unexpected behaviour, feel free to share them, and we can dive deeper into the issue. 😊🔍

Also, if you have any other questions related to Databricks or dbt, feel free to ask! 🚀

 

Connect with Databricks Users in Your Area

Join a Regional User Group to connect with local Databricks users. Events will be happening in your city, and you won’t want to miss the chance to attend and share knowledge.

If there isn’t a group near you, start one and help create a community that brings people together.

Request a New Group