Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

How do I cast using a DataFrame?

cfregly
Contributor
 
5 REPLIES

cfregly
Contributor

You can use HiveQL's cast() type conversion function to cast an element of a nested map in Python as follows:

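from pyspark.sql import Row
# Build a one-row DataFrame whose column "a" is a map.
df = sqlContext.createDataFrame([Row(a={'b': 1})])
# Named df_str rather than str so Python's built-in str is not shadowed.
df_str = df.selectExpr("cast(a['b'] AS STRING)")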

or in Scala as follows:

// Tuple1 wraps the map so toDF also works where no encoder exists for a bare Map.
val df = Seq(Tuple1(Map("a" -> 1))).toDF("a")
df.selectExpr("cast(a['a'] AS STRING)")

Grr
New Contributor II

If your df is registered as a table you can also do this with a SQL call:

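# The view name my_table is arbitrary; it avoids using the SQL keyword "table" as a name.
df.createOrReplaceTempView("my_table")
# Named df_str rather than str so Python's built-in str is not shadowed.
df_str = spark.sql('''
    SELECT CAST(a['b'] AS STRING)
    FROM my_table
''')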

It's more code in the simple case, but I have found that when the cast is part of a much more complex query, the SQL form can be easier to read.

DarrellUlm
New Contributor II

You could also use withColumn() to do it without Spark SQL, although the performance may differ. The open question is whether creating a new column takes more time than using Spark SQL.

Something like:

import org.apache.spark.sql.types.IntegerType
val dfNew = df.withColumn("newColName", df("originalColName").cast(IntegerType))
    .drop("originalColName").withColumnRenamed("newColName", "originalColName")

Create the new column, casting from the original column, drop the original, then rename the new column back to the original name. A bit roundabout, but looks like it could work.

Is it safe to cast a column that contains null values?

srisre111
New Contributor II

I am trying to store a DataFrame as a table in Databricks and am encountering the following error. Can someone help?

"TypeError: field date: Can not merge type <class 'pyspark.sql.types.StringType'> and <class 'pyspark.sql.types.DoubleType'>"
