cancel
Showing results for 
Search instead for 
Did you mean: 
Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
cancel
Showing results for 
Search instead for 
Did you mean: 

Value of SQL variable in IF statement using Spark SQL

RicardoS
New Contributor II

Hi there,

I am new to Spark SQL and would like to know if it possible to reproduce the below T-SQL query in Databricks. This is a sample query, but I want to determine if a query needs to be executed or not.

 

DECLARE
      @VariableA AS INT
,     @VariableB AS INT;

SELECT @VariableA = COUNT(*)
FROM [xxx].[TableExample]

SELECT @VariableB = COUNT(*) + 1
FROM [xxx].[TableExample]

IF (@VariableA = @VariableB)
BEGIN
      PRINT 'Equal'
END
ELSE
BEGIN
      PRINT 'Not equal'
END

 

I want to use an IF statement to determine it. I already know how to set a variable using a SELECT statement:

 

SET VariableA = SELECT COUNT(*) FROM [xxx].[TableExample];
SET VariableB = SELECT COUNT(*) + 1 FROM [xxx].[TableExample];

 

Next I want to use VariableA and VariableB in an IF statement:

 

IF (${VariableA} = ${VariableB}, 'Equal', 'Not equal');

 

This is not working, because underwater this expression writes down the full SELECT statements of both variables instead of the results (numbers) to compare. When I write down only ${VariableA}; it gives me the correct result (number).

How can I make this work in the IF statement?

So to be clear, no PySpark solution, but SQL (Spark). Thank you!

3 REPLIES 3

Suman1985
New Contributor II

Hello ,  I read your question I just tell you that if you try following **bleep** , 

In Databricks, you can achieve similar conditional logic using Spark SQL. However, Spark SQL doesn't directly support variable assignment and manipulation in the same way as T-SQL. Instead, you would typically achieve your goal using DataFrames or SQL expressions. Here's how you can reproduce the logic you provided using Spark SQL:

-- Load the table as a DataFrame
val df = **bleep**.TableExample")

-- Calculate counts using DataFrames
val variableA = df.count()
val variableB = df.count() + 1

-- Use CASE WHEN for conditional logic
val result = spark.sql(
s"""
SELECT CASE WHEN $variableA = $variableB THEN 'Equal' ELSE 'Not equal' END AS result
"""
)

result.show()

****bleep** in a DataFrame.

Please note that Spark SQL does not directly support the use of variables like T-SQL does. Instead, you work with DataFrames and SQL expressions to achieve similar functionality. Myjdfaccount

 

I hope you like my answer Thank you i always try to give my best.

Best Regards,

RicardoS
New Contributor II

Hi there,

Thank you for your response.

Using DataFrames and SQL expressions can still be done in a Spark SQL type of Notebook or is this already PySpark?

Thank you in advance!

Edthehead
Contributor

Since you are looking for a single value back, you can use the CASE function to achieve what you need.

%sql
SET var.myvarA = (SELECT 6);
SET var.myvarB = (SELECT 7);
SELECT CASE WHEN ${var.myvarA} = ${var.myvarB} THEN 'Equal' ELSE 'Not equal' END AS result

 

Connect with Databricks Users in Your Area

Join a Regional User Group to connect with local Databricks users. Events will be happening in your city, and you won’t want to miss the chance to attend and share knowledge.

If there isn’t a group near you, start one and help create a community that brings people together.

Request a New Group