Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

SaraCorralLou
by New Contributor III
  • 13812 Views
  • 2 replies
  • 0 kudos

Resolved! Differences between lit(None) or lit(None).cast('string')

I want to define a column with null values in my DataFrame using PySpark. This column will later be used for other calculations. What is the difference between creating it in these two different ways? df.withColumn("New_Column", lit(None)) df.withColumn...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @Sara Corral, thank you for your question! To assist you better, please take a moment to review the answer and let me know if it best fits your needs. Please help us select the best solution by clicking on "Select As Best" if it does. Your feedback ...

1 More Replies
SIRIGIRI
by Contributor
  • 770 Views
  • 1 replies
  • 1 kudos

medium.com

Sorting in Spark: How do you sort null values first or last among the records in a Spark DataFrame? Please find the answers here: https://medium.com/@sharikrishna26/sorting-in-spark-a57db245ecd4

Latest Reply
Aviral-Bhardwaj
Esteemed Contributor III
  • 1 kudos

Yeah, this is a really good post. Keep it up, man.

DB_developer
by New Contributor III
  • 1464 Views
  • 3 replies
  • 0 kudos
Latest Reply
-werners-
Esteemed Contributor III
  • 0 kudos

There is no single answer to this. If you look at Parquet, which is a very common format on data lakes: https://parquet.apache.org/docs/file-format/nulls/ and on SO

2 More Replies
Kody_Devl
by New Contributor II
  • 4613 Views
  • 3 replies
  • 2 kudos

%SQL Append null values into a SQL Table

Hi All, I am new to Databricks and am writing my first program. Note: code shown below. I am creating a table with 3 columns to store data. 2 of the columns will be appended in from data that I have in another table. When I run my append query into the...

Latest Reply
Kody_Devl
New Contributor II
  • 2 kudos

Hi Hubert, your answer moves me closer to being able to update a 26-field MMR_Restated table in pieces as the correct field values are calculated through the process. I have been looking for a way to be able to update in "pieces"... 2 fie...

2 More Replies
sarvesh
by Contributor III
  • 6382 Views
  • 9 replies
  • 8 kudos

Resolved! Getting null values in place of data that was removed manually from an Excel file (solved)

I was reading an Excel file with one column:

country
india
India
india
India
india

The DataFrame I got from this data, via df.show():

+-------+
|country|
+-------+
| india |
| India |
| india |
| India |
| india |
+-------+

In the next step I removed the last value ...

Latest Reply
Anonymous
Not applicable
  • 8 kudos

@sarvesh singh​ - Thank you for letting us know. Would you be happy to mark the best answer so others can find the solution easily?

8 More Replies
User15787040559
by Databricks Employee
  • 2881 Views
  • 1 replies
  • 0 kudos

How can I create from scratch a brand new Dataframe with Null values using spark.createDataFrame()?

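from pyspark.sql.types import IntegerType, StringType, StructField, StructType

# All three columns are nullable (third argument True), so None values are allowed.
schema = StructType([
    StructField("c1", IntegerType(), True),
    StructField("c2", StringType(), True),
    StructField("c3", StringType(), True),
])

# `spark` is the SparkSession that a Databricks notebook provides.
df = spark.createDataFrame([(1, "2", None), (3, "4", None)], schema)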

Latest Reply
Mooune_DBU
Valued Contributor
  • 0 kudos

df = spark.createDataFrame(sc.emptyRDD(), schema)
Can you try this?

Anonymous
by Not applicable
  • 1069 Views
  • 0 replies
  • 0 kudos

Newline characters mess up the table records

When creating tables from text files that contain newline characters in the middle of lines, the table records end up with null column values, because each embedded newline breaks the line into two different records and fill up ...
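A minimal sketch of the behavior described above, using only Python's standard library rather than Spark (an assumption made for illustration; the sample data, column names, and helper functions are hypothetical). Splitting the text on every newline breaks a record with an embedded newline into two rows that have missing values, whereas a parser that honors quoted fields keeps the record whole.

```python
import csv
import io

# Hypothetical sample: the second record has a newline inside a quoted field.
raw = 'id,comment,status\n1,"all good",ok\n2,"line one\nline two",ok\n'


def naive_rows(text, width=3):
    """Split on every newline, then pad short rows with None."""
    rows = []
    for line in text.splitlines()[1:]:  # skip the header line
        cells = line.split(",")
        cells += [None] * (width - len(cells))
        rows.append(cells)
    return rows


def quoted_rows(text):
    """Parse with csv.reader, which keeps newlines inside quoted fields."""
    reader = csv.reader(io.StringIO(text, newline=""))
    next(reader)  # skip the header row
    return list(reader)


naive = naive_rows(raw)    # three rows, two of them padded with None
parsed = quoted_rows(raw)  # two rows, matching the two real records

for row in naive:
    print("naive :", row)
for row in parsed:
    print("parsed:", row)
```

Spark's CSV reader has a `multiLine` option intended for quoted fields that span several lines. Whether it helps in a given case depends on the embedded newlines actually sitting inside quoted fields in the source file.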
