Databricks Community

HariharaSam · ‎01-12-2022

Suppose we have two tables, A and B.

qry = """
INSERT INTO A
SELECT * FROM B WHERE Id IS NULL
"""

spark.sql(qry)

I need to get the number of records inserted after running this in Databricks.

Kaniz_Fatma · ‎01-13-2022

Hi @Hariharan Sambath, you can use @@ROW_COUNT just after your insert statements.

HariharaSam · ‎01-13-2022

Hi,

I am getting a syntax error when I run @@ROW_COUNT after the insert statement.

I am running the code in Databricks.

Kaniz_Fatma · ‎01-13-2022

Hi @Hariharan Sambath,

Use @@ROW_COUNT just after your insert statements:

qry = """
INSERT INTO Table A
Select * from Table B where Id is null
Select @@ROWCOUNT
"""
spark.sql(qry)

HariharaSam · ‎01-13-2022

Hi @Kaniz Fatma,

I tried it the way you described, but it still throws an error.

HariharaSam · ‎01-13-2022

Hi,

My requirement is to write a Python function that performs an insert into a Delta table, which is why I am running this in a Python cell.

I will pass a table name to that function, and I need to get the number of records inserted into the table once the function has run.

Is there any way to achieve this?

Hubert-Dudek · ‎01-14-2022

@@ROWCOUNT is a T-SQL function, not a Spark SQL one. I haven't found an equivalent in the documentation, but there is another way: every insert returns a result with the fields num_affected_rows and num_inserted_rows.

So you can, for example, use

df = spark.sql(qry)
df.first()['num_inserted_rows']

or a subquery and select in SQL syntax.

I am including example screenshots.
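
Since the screenshots are not reproduced here, the following is a minimal sketch of how the approach could be wrapped in a Python function that takes a table name, as asked earlier in the thread. The function name insert_and_count and its parameters are hypothetical, and it assumes the INSERT result exposes a num_inserted_rows column as described above.

```python
def insert_and_count(spark, target_table, select_sql):
    """Run INSERT INTO <target_table> <select_sql> and return the inserted row count.

    Hypothetical helper: `spark` is an active SparkSession, and both
    `target_table` and `select_sql` are interpolated as-is, so pass only
    trusted values.
    """
    result = spark.sql(f"INSERT INTO {target_table} {select_sql}")
    # Assumption from the thread: the result is a one-row DataFrame with
    # num_affected_rows and num_inserted_rows columns.
    row = result.first()
    return row["num_inserted_rows"]
```

In a notebook this could be called as insert_and_count(spark, "A", "SELECT * FROM B WHERE Id IS NULL").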

HariharaSam · ‎01-14-2022

Hi @Hubert Dudek

Your approach is working for me.

Thank you.

Hubert-Dudek · ‎01-14-2022

Great! When you can, please select it as the best answer.

Tim3 · ‎06-14-2023

@Hubert Dudek, when I run a similar piece of code from VS Code through databricks-connect, the DataFrame contains 1 row with no columns, which is a problem. Running the same code in a notebook on the same cluster works as you stated. Is this possibly a bug in databricks-connect?
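
This does not explain the databricks-connect behavior, but a defensive check along these lines could at least make the missing column explicit instead of raising an error. It is only a sketch: the helper name inserted_rows_or_none is hypothetical, and it assumes the argument is the DataFrame returned by spark.sql for the INSERT.

```python
def inserted_rows_or_none(result_df):
    """Return num_inserted_rows from an INSERT result, or None if it is absent.

    Hypothetical helper: `result_df` is the DataFrame returned by
    spark.sql("INSERT INTO ...").
    """
    # Guard against the case reported above: one row, but no columns.
    if "num_inserted_rows" not in result_df.columns:
        return None
    row = result_df.first()
    return None if row is None else row["num_inserted_rows"]
```

A caller that gets None back can then fall back to another source for the count, such as the table history.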

GRCL · ‎06-15-2023

Almost the same advice as Hubert's: I use the history of the Delta table:

df_history.select(F.col('operationMetrics')).collect()[0].operationMetrics['numOutputRows']

You can also find other 'operationMetrics' values, such as 'numTargetRowsDeleted'.
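
The line above does not show where df_history comes from. One possible way to fill that in, as a sketch only, is to read the latest history entry with DESCRIBE HISTORY. The function name last_write_output_rows is hypothetical, and the code assumes the most recent operation on the table is the insert you just ran, which may not hold if other jobs write to the same table concurrently.

```python
def last_write_output_rows(spark, table_name):
    """Return numOutputRows of the latest operation on a Delta table, or None.

    Hypothetical helper: `spark` is an active SparkSession and `table_name`
    is interpolated as-is, so pass only trusted values.
    """
    # DESCRIBE HISTORY lists operations newest first; LIMIT 1 keeps the latest.
    history = spark.sql(f"DESCRIBE HISTORY {table_name} LIMIT 1")
    latest = history.select("operationMetrics").first()
    if latest is None:
        return None
    # operationMetrics is a string-to-string map, so the count needs int().
    value = latest["operationMetrics"].get("numOutputRows")
    return int(value) if value is not None else None
```

Checking the operation column of the same history row before trusting the metric would be a sensible extra safeguard.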

To get Number of rows inserted after performing an Insert operation into a table