05-24-2024 03:40 AM
This code fails with exception:
[NOT_COLUMN_OR_STR] Argument `col` should be a Column or str, got Column.
File <command-4420517954891674>, line 7
      4 spark = DatabricksSession.builder.getOrCreate()
      6 df = spark.read.table("samples.nyctaxi.trips")
----> 7 df.select(lit(5).alias('height')).show()
from databricks.connect import DatabricksSession
from pyspark.sql.functions import lit
spark = DatabricksSession.builder.getOrCreate()
df = spark.range(1)
df.select(lit(5).alias('height'), df.id).show()
Can you confirm this is a bug?
05-27-2024 10:33 PM
I am sorry, but this is not helpful. ChatGPT does not work well for PySpark code.
05-27-2024 07:17 AM - edited 05-27-2024 07:20 AM
It's an official example from the PySpark documentation:
It works on older runtimes, and it worked one week ago. Please fix the internal Databricks Connect on the latest runtimes.
05-27-2024 10:35 PM
We don't understand the issue because it appeared suddenly, but we fixed it by migrating to 15.2.
Maybe Databricks released some 15.1.XXXX update that broke things?
05-29-2024 08:27 AM
Hi, I'm having the same problem using the 14.3 LTS runtime.
The error just appeared yesterday. Before that, everything was working fine.
05-29-2024 02:48 PM
We are also seeing this error in 14.3 LTS from a simple example:
from pyspark.sql.functions import col
df = spark.table('things')
things = df.select(col('thing_id')).collect()
[NOT_COLUMN_OR_STR] Argument `col` should be a Column or str, got Column.
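The message is confusing because the expected type and the received type are both printed as Column. One possible explanation, which is an assumption and is not confirmed anywhere in this thread, is that two different Column classes are involved, for example the classic pyspark.sql.column.Column and the Spark Connect pyspark.sql.connect.column.Column. A minimal sketch for checking this is below; the helper names qualified_type_name and report_column_types are hypothetical and exist only for illustration.

```python
def qualified_type_name(obj):
    """Return the module-qualified class name of obj."""
    cls = type(obj)
    return f"{cls.__module__}.{cls.__qualname__}"


def report_column_types(df, column):
    """Print the classes of a DataFrame and a Column so they can be compared.

    Assumption: if one name contains '.connect.' and the other does not,
    the two objects come from different PySpark implementations.
    """
    print("DataFrame:", qualified_type_name(df))
    print("Column:   ", qualified_type_name(column))
```

In a notebook where the error occurs, calling report_column_types(df, col('thing_id')) prints both class names. If they come from different modules, that would point to a mismatch in the environment rather than a problem in the query itself.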
09-10-2024 05:51 PM
I can see this issue in 13.3 LTS. Our production code is still running on 11.3 LTS, but upgrading to a higher LTS DBR version gives this error. I believe you should fix it or provide a migration guide from one DBR version to the other.