cancel
Showing results for 
Search instead for 
Did you mean: 
Data Engineering
cancel
Showing results for 
Search instead for 
Did you mean: 

date_format 'LLLL' returns '1'

Kibour
Contributor

Hi all,

In my notebook, when I run my cell with following code

%sql

select date_format(date '1970-01-01', "LLL");

I get '1', while I expect 'Jan' according to the doc
https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html

 

I would also expect the following to return 'January', but I get '1' instead

%sql
select to_csv(named_struct('date', date '1970-01-01'), map('dateFormat', 'LLLL', 'locale', 'EN'));
 
What am I missing?
 
Thanks in advance for your insights
1 ACCEPTED SOLUTION

Accepted Solutions

Kibour
Contributor

Hi @Kaniz ,

Turns out it was actually a Java 8 bug:

IllegalArgumentException: Java 8 has a bug to support stand-alone form (3 or more 'L' or 'q' in the pattern string). Please use 'M' or 'Q' instead, or upgrade your Java version. For more details, please read https://bugs.openjdk.java.net/browse/JDK-8114833

Upgarding to Java 11 fixed the issue.

Kind regards

View solution in original post

3 REPLIES 3

Kaniz
Community Manager
Community Manager

Hi @Kibour

It seems like you’re encountering unexpected behaviour when formatting dates in your Databricks Notebook using Spark SQL.

Let’s break down the issues you’re facing:

  1. date_format Issue: When you run the following query:

    %sql
    select date_format(date '1970-01-01', "LLL");
    

    You expect it to return ‘Jan’, but it gives you ‘1’. The issue lies in the date format pattern.

    Solution: The problem occurs because you used "LLL" as the format pattern.

  2. To get the abbreviated month name (like ‘Jan’), you should use "MMM" instead. Here’s the corrected query:

    %sql
    select date_format(date '1970-01-01', "MMM");
    
  3. to_csv Issue: Next, you’re trying to convert a named struct containing a date to CSV format with the following query:

    %sql
    select to_csv(named_struct('date', date '1970-01-01'), map('dateFormat', 'LLLL', 'locale', 'EN'));
    

    You expect it to return ‘January’, but it gives you ‘1’.

    Solution: Similar to the previous issue, the problem lies in the format pattern.

  4. To get the full month name (like ‘January’), use "MMMM" instead of "LLLL":

    %sql
    select to_csv(named_struct('date', date '1970-01-01'), map('dateFormat', 'MMMM', 'locale', 'EN'));
    

Remember to adjust the format patterns as suggested above, and your queries should return the expected results! 😊

 

Hi @Kaniz ,

Thanks for your detailed answer. If I read you correctly, this means the spark docuementation is incorrect. I copy-pasted my queries direclty from the documentation examples, but can't reproduce them.

My cluster is runnig with Apache Spark 3.5.0, Scala 2.12.

Thanks and kind regards

Kibour
Contributor

Hi @Kaniz ,

Turns out it was actually a Java 8 bug:

IllegalArgumentException: Java 8 has a bug to support stand-alone form (3 or more 'L' or 'q' in the pattern string). Please use 'M' or 'Q' instead, or upgrade your Java version. For more details, please read https://bugs.openjdk.java.net/browse/JDK-8114833

Upgarding to Java 11 fixed the issue.

Kind regards

Welcome to Databricks Community: Lets learn, network and celebrate together

Join our fast-growing data practitioner and expert community of 80K+ members, ready to discover, help and collaborate together while making meaningful connections. 

Click here to register and join today! 

Engage in exciting technical discussions, join a group with your peers and meet our Featured Members.