cancel
Showing results forย 
Search instead forย 
Did you mean:ย 
Data Engineering
cancel
Showing results forย 
Search instead forย 
Did you mean:ย 

How to count the number of campaigns per day based on the start and end dates of the campaigns in SQL Spark Databrick

dulu
New Contributor III

I need to count the number of campaigns per day based on the start and end dates of the campaigns

Input Table:

image 1 

Out needed (result):

image 2How do I need to write the SQL command in databricks to get the above result? thanks all

1 ACCEPTED SOLUTION

Accepted Solutions

UmaMahesh1
Honored Contributor III

Hi @Du Luโ€‹ 

Step1 : Collect all the dates between start date and end date as the campaign is happening on those dates.

Step2; Explode the collected date set

Step3: Count the number of campaigns happening on a particular date using a simple grouping expression.

Query looks something like this : (assuming startDate and EndDate are date columns and not strings. If strings, cast to date before proceeding).

select count(campaignName) as CampaignCount , EventDate from

( select campaignName, explode(sequence(StartDate, EndDate, interval 1 day)) as EventDate from source)

group by 2

If you want specific date range you can do that by adding a where clause in the above query.

select count(campaignName) as CampaignCount , EventDate from

( select campaignName, explode(sequence(StartDate, EndDate, interval 1 day)) as EventDate from source)

where EventDate between "2022-07-06" and "2022-08-03"

group by 2

Hope this helps...

Cheers.

View solution in original post

3 REPLIES 3

UmaMahesh1
Honored Contributor III

Hi @Du Luโ€‹ 

Step1 : Collect all the dates between start date and end date as the campaign is happening on those dates.

Step2; Explode the collected date set

Step3: Count the number of campaigns happening on a particular date using a simple grouping expression.

Query looks something like this : (assuming startDate and EndDate are date columns and not strings. If strings, cast to date before proceeding).

select count(campaignName) as CampaignCount , EventDate from

( select campaignName, explode(sequence(StartDate, EndDate, interval 1 day)) as EventDate from source)

group by 2

If you want specific date range you can do that by adding a where clause in the above query.

select count(campaignName) as CampaignCount , EventDate from

( select campaignName, explode(sequence(StartDate, EndDate, interval 1 day)) as EventDate from source)

where EventDate between "2022-07-06" and "2022-08-03"

group by 2

Hope this helps...

Cheers.

Hubert-Dudek
Esteemed Contributor III

Just create an array with sequence, explode it, and then group and count:

WITH cte AS
(SELECT
  `campaign name`, 
  explode(sequence(`Start date`, `End date`, interval 1 day)) as `Date`
FROM
  `campaigns`)
SELECT
  Count(`campaign name`) as `count unique campaigns`,
  `Date`
FROM
  cte
GROUP BY
  `Date`;

Nice query...

Welcome to Databricks Community: Lets learn, network and celebrate together

Join our fast-growing data practitioner and expert community of 80K+ members, ready to discover, help and collaborate together while making meaningful connections. 

Click here to register and join today! 

Engage in exciting technical discussions, join a group with your peers and meet our Featured Members.