Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

DatabricksSession broken for 15.1

TWib
New Contributor III

The code below fails with this exception:

[NOT_COLUMN_OR_STR] Argument `col` should be a Column or str, got Column.
File <command-4420517954891674>, line 7
      4 spark = DatabricksSession.builder.getOrCreate()
      6 df = spark.read.table("samples.nyctaxi.trips")
----> 7 df.select(lit(5).alias('height')).show()

 

 

from databricks.connect import DatabricksSession
from pyspark.sql.functions import lit 

spark = DatabricksSession.builder.getOrCreate()

df = spark.range(1)
df.select(lit(5).alias('height'), df.id).show()

 

 

 Can you confirm this is a bug?

8 REPLIES

TWib
New Contributor III

I am sorry, but this is not helpful. ChatGPT does not work well for PySpark code.

MM3
New Contributor II

It's an official example from pyspark documentation:

https://spark.apache.org/docs/latest/api/python/reference/pyspark.sql/api/pyspark.sql.functions.lit....

It works on older runtimes, and it was still working a week ago. Please fix the internal Databricks Connect on the latest runtimes.

TWib
New Contributor III

We don't understand the issue because it appeared suddenly, but we fixed it by migrating to 15.2.

Maybe Databricks released some 15.1.XXXX update that broke things?

zerodarkzone
New Contributor III

Hi, I'm having the same problem on the 14.3 LTS runtime.

The error just appeared yesterday. Before that, everything was working fine.

jcap
New Contributor II

We are also seeing this error in 14.3 LTS from a simple example:

from pyspark.sql.functions import col

df = spark.table('things')
things = df.select(col('thing_id')).collect()

[NOT_COLUMN_OR_STR] Argument `col` should be a Column or str, got Column.
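
One possible reading of the "should be a Column or str, got Column" wording, offered as an assumption rather than something confirmed in this thread: the object passed in may be an instance of a different class that also happens to be named Column (for example, a column built by the classic PySpark API handed to a Spark Connect DataFrame). A type check that prints only the short class name would then produce exactly this confusing message. The sketch below is plain Python with hypothetical stand-ins (`classic_api`, `connect_api`, `select`, `qualified_name`); it does not call PySpark or Databricks Connect.

```python
# Hypothetical stand-ins: two unrelated classes that share the short name "Column".
def make_column_class(module_name):
    # type() creates a new class; __module__ records where it "lives".
    return type("Column", (), {"__module__": module_name})


ClassicColumn = make_column_class("classic_api")  # assumed name, illustration only
ConnectColumn = make_column_class("connect_api")  # assumed name, illustration only


def select(col):
    """Mimic an argument check that reports only the short class name."""
    if not isinstance(col, (ConnectColumn, str)):
        raise TypeError(
            "[NOT_COLUMN_OR_STR] Argument `col` should be a Column or str, "
            f"got {type(col).__name__}."
        )
    return "ok"


def qualified_name(obj):
    """Return module plus class name, which tells the two Column types apart."""
    t = type(obj)
    return f"{t.__module__}.{t.__qualname__}"


if __name__ == "__main__":
    print(select(ConnectColumn()))          # accepted: the expected Column type
    print(qualified_name(ClassicColumn()))  # shows the module of the other Column
    try:
        select(ClassicColumn())             # rejected despite being named Column
    except TypeError as exc:
        print(exc)
```

On an affected cluster, comparing `type(lit(5)).__module__` with `type(df).__module__` may show whether the column and the DataFrame come from different packages. If they do, that mismatch, rather than the user code, would be the more likely culprit.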

 

 

977073
New Contributor II

I can see this issue in 13.3 LTS. Our production code is still running on 11.3 LTS, but upgrading to a higher LTS DBR version gives this error. I believe you should either fix it or provide a migration guide from one DBR version to the next.

yigalk
New Contributor II

I also get the same error on runtime 13.3 LTS. The same code on 15.2 LTS seems to work.

from pyspark.sql.functions import concat, lit

df.withColumn("new_col", concat("col1", lit("-"), "col2"))

yigalk
New Contributor II

Correction: it actually isn't working on runtime 15.

It does work on a shared cluster on runtime 15.4, but I also need to use RDDs for something, and that fails on shared clusters.
