Administration & Architecture
Explore discussions on Databricks administration, deployment strategies, and architectural best practices. Connect with administrators and architects to optimize your Databricks environment for performance, scalability, and security.

Collation problem with df.first() when different from UTF8_BINARY

MDV
New Contributor III

I'm getting an error when I call first() on a DataFrame whose string column uses a collation other than UTF8_BINARY.

This works:

df_result = spark.sql(f"""
                        SELECT 'en-us' AS ETLLanguageCode
""")
display(df_result)
print(df_result.collect())
print(df_result.first())
print(df_result.first().asDict())
 
But when I run this:
 
df_result = spark.sql(f"""
                        SELECT 'en-us' COLLATE UTF8_LCASE AS ETLLanguageCode
""")
display(df_result)
print(df_result.collect())
print(df_result.first())
print(df_result.first().asDict())
 
I get an error because first() returns nothing, even though the count of the DataFrame is 1.
 
What can I do to resolve this? All string columns in my tables use UTF8_LCASE.
 
Settings:
1-1 Worker: 16-16 GB Memory, 4-4 Cores
1 Driver: 16 GB Memory, 4 Cores
Runtime: 16.3.x-scala2.12
Unity Catalog
Photon
Standard_D4ds_v5
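
One workaround that may be worth trying is to re-collate the string columns back to UTF8_BINARY just before pulling rows to the driver, since the UTF8_BINARY variant of the query works. The sketch below only builds the SQL text for that. The helper name binary_collated_select and the temp view name etl_result are hypothetical, and whether this avoids the empty first() on this runtime is an assumption, not something confirmed in this thread.

```python
def binary_collated_select(view_name, string_columns, other_columns=()):
    """Build a SELECT that re-collates the given string columns to UTF8_BINARY.

    Hypothetical helper: it only produces SQL text and does not touch Spark.
    """
    # Re-collate each string column and keep its original name.
    parts = [f"`{c}` COLLATE UTF8_BINARY AS `{c}`" for c in string_columns]
    # Pass non-string columns through unchanged.
    parts += [f"`{c}`" for c in other_columns]
    return f"SELECT {', '.join(parts)} FROM {view_name}"


query = binary_collated_select("etl_result", ["ETLLanguageCode"])
print(query)

# On the cluster (assumes the notebook's `spark` session and the df_result above):
# df_result.createOrReplaceTempView("etl_result")
# print(spark.sql(query).first())
```

This changes the collation only for the rows being collected, so the tables themselves can stay UTF8_LCASE.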
 
 
