cancel
Showing results for 
Search instead for 
Did you mean: 
Data Engineering
cancel
Showing results for 
Search instead for 
Did you mean: 

FiscalYear Start Period Is not Correct

JJ_LVS1
New Contributor III

Hi, I'm trying to create a calendar dimension including a fiscal year with a fiscal start of April 1. I'm using the fiscalyear library and am setting the start to month 4 but it insists on setting April to month 7.

runtime 12.1

My code snipet is:

start_date = '2016-01-01'
    end_date = date(date.today().year, 12, 31) + relativedelta(years=5)
    
    fiscalyear.setup_fiscal_calendar(start_month=4)
    
    df = spark.sql(f"SELECT SEQUENCE(TO_TIMESTAMP('{start_date}'), TO_TIMESTAMP('{end_date}'), INTERVAL 1 DAY) AS DATETIME").withColumn("DATETIME", F.explode(F.col("DATETIME")))
 
    # bulk definition of UDFs
    udf_f_y = udf(lambda CELL: fiscalyear.FiscalDateTime(CELL.year, CELL.month, CELL.day).fiscal_year) # fiscal DateTime object f_y
    udf_f_m = udf(lambda CELL: fiscalyear.FiscalDateTime(CELL.year, CELL.month, CELL.day).fiscal_month) # fiscal DateTime object f_m
    udf_f_m_str = udf(lambda CELL: 'M' + str(CELL).zfill(2)) # use zfill to turn 2 into 02 and attach "M" to the front
    udf_f_q = udf(lambda CELL: fiscalyear.FiscalDateTime(CELL.year, CELL.month, CELL.day).fiscal_quarter) # fiscal DateTime object f_q
    udf_f_d = udf(lambda CELL: fiscalyear.FiscalDateTime(CELL.year, CELL.month, CELL.day).fiscal_day) # fiscal DateTime object f_d
 
    # Fiscal Columns
    
    df = df.withColumn("FISCAL_YEAR_INTEGER", udf_f_y(F.col("DATETIME")))
    df = df.withColumn("FISCAL_QUARTER_INTEGER",  udf_f_q(F.col("DATETIME")))
    df = df.withColumn("FISCAL_MONTH_INTEGER", udf_f_m(F.col("DATETIME")))
 
    df = df.withColumn("FISCAL_DAY_INTEGER", udf_f_d(F.col("DATETIME")))
    df = df.withColumn("FISCAL_YEAR_STRING", F.concat_ws("", F.lit("FY"),  F.col("FISCAL_YEAR_INTEGER").cast("STRING").substr(3, 4)))
    df = df.withColumn("FISCAL_QUARTER_STRING", F.concat_ws("_", F.col("FISCAL_YEAR_STRING"), F.concat_ws("", F.lit( "Q" ), F.col("FISCAL_QUARTER_INTEGER")) ) )
    df = df.withColumn("FISCAL_MONTH_STRING",  F.concat_ws("_", F.col("FISCAL_YEAR_STRING"), udf_f_m_str(F.col("FISCAL_MONTH_INTEGER"))))

Not sure if it's something to do with using the udf and the lambda cell because I've tested it manually with the following and I properly get April as fiscal month 1:

import fiscalyear
from pyspark.sql.functions import col
from pyspark.sql.functions import to_timestamp
 
#### Testing fiscalyear library #####
 
# Set the fiscal year start month to April (month 4)
fiscalyear.START_MONTH = 4
 
# # Create a FiscalDateTime object representing a calendar date
df = fiscalyear.FiscalDateTime(2023, 4, 1)
 
# Convert the calendar date to a fiscal year
fiscal_year = df.fiscal_year
fiscal_month = df.fiscal_month
 
print(fiscal_year, fiscal_month)
 
2024 1

Can anyone offer any insights? Thank you for looking.

4 REPLIES 4

Anonymous
Not applicable

@Jason Johnson​ :

It seems that the fiscalyear library you are using is actually setting the start month to July (month 7) instead of April (month 4) as you intended. This might be a bug or a known issue with the library.

To work around this, you can manually adjust the month of your date inputs to compensate for the offset. For example, you can subtract 3 from the month of your start date to shift it back to April:

start_date = '2016-01-01'
start_date = f'{start_date[:5]}{int(start_date[5:7])-3:02d}{start_date[7:]}'
 
# start_date is now '2016-04-01'

Similarly, you can subtract 3 from the month of your fiscal year calculation to get the correct fiscal month:

udf_f_m = udf(lambda CELL: fiscalyear.FiscalDateTime(CELL.year, CELL.month, CELL.day).fiscal_month - 3)

With these adjustments, your code should correctly generate a calendar dimension with a fiscal year starting in April.

JJ_LVS1
New Contributor III

Thanks for the reply. Not sure why I hadn't thought of doing this. I'll add these changes in and see how it turns out.

Anonymous
Not applicable

Hi @Jason Johnson​ 

Thank you for posting your question in our community! We are happy to assist you.

Help us build a vibrant and resourceful community by recognizing and highlighting insightful contributions. Mark the best answers and show your appreciation!

DataEnginner
New Contributor II

 

import fiscalyear
import datetime

def get_fiscal_date(year,month,day):
  fiscalyear.setup_fiscal_calendar(start_month=4)
  v_fiscal_month=fiscalyear.FiscalDateTime(year, month, day).fiscal_month #To get the Fiscal Month
  v_fiscal_quarter=fiscalyear.FiscalDateTime(year, month, day).fiscal_quarter #To get the Fiscal Month
  v_fiscal_year=fiscalyear.FiscalDateTime(year, month, day).fiscal_year #To get the Fiscal Month
  return v_fiscal_month,v_fiscal_quarter,v_fiscal_year


from pyspark.sql.functions import udf,col
from pyspark.sql.types import ArrayType, IntegerType,StringType

# Define the UDF with return type of ArrayType containing three IntegerTypes
get_fiscal_date_udf = udf(lambda year, month, day: get_fiscal_date(year, month, day), ArrayType(StringType()))

# Register the UDF with Spark SQL
spark.udf.register("get_fiscal_date_udf", get_fiscal_date,(ArrayType(StringType())))

%sql
select get_fiscal_date_udf(2024,4,24), get_fiscal_date_udf(2023,12,24), get_fiscal_date_udf(2024,1,24), get_fiscal_date_udf(2024,8,24)

In My case, April is the Financial start month, so in UDF function, we have to set 

fiscalyear.setup_fiscal_calendar(start_month=4).
 
But in ur case ur calling directly main function, by default its taking in build start month

 

Welcome to Databricks Community: Lets learn, network and celebrate together

Join our fast-growing data practitioner and expert community of 80K+ members, ready to discover, help and collaborate together while making meaningful connections. 

Click here to register and join today! 

Engage in exciting technical discussions, join a group with your peers and meet our Featured Members.