11-17-2021 07:00 AM
I was reading an Excel file with a single column:
country
india
India
india
India
india
The DataFrame I got from this data, as printed by df.show():
+-------+
|country|
+-------+
|  india|
|  India|
|  india|
|  India|
|  india|
+-------+
In the next step I removed the last value from the Excel file manually with Backspace and saved the file.
The file now contains:
country
india
India
india
India
Now when I run the same df.show(), this is what I get:
+-------+
|country|
+-------+
|  india|
|  India|
|  india|
|  India|
|   null|
+-------+
If I have removed the value, why do I get a null value in its place?
Here is my code, in case someone needs it:
import org.apache.spark.sql.SparkSession

// Local Spark session for the test
val spark = SparkSession
  .builder
  .appName("schemaTest")
  .master("local[*]")
  .getOrCreate()

// Read the workbook through the spark-excel data source
val df = spark.read
  .format("com.crealytics.spark.excel")
  .option("header", "false")
  .option("inferSchema", "true")
  .option("treatEmptyValuesAsNulls", "false")
  .option("addColorColumns", "false")
  .load("data/trimTest2.xlsx")

df.show()
Edit: when I changed a value in my Excel file, I cleared the cell with Backspace instead of using Excel's "delete row". That made Excel think there was still a row there, just a blank one, so Spark read it as a null.
If you use "delete row" instead, Excel removes the complete row and Spark does not read anything there.
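If editing the workbook is not an option, the leftover blank row can also be filtered out on the Spark side. Below is a minimal sketch, assuming that a "blank" row is one where every column is null. The object name BlankRowCleanup, the helper dropBlankRows, and the in-memory stand-in data are made up for illustration; the only Spark call doing the work is the standard df.na.drop("all").

```scala
import org.apache.spark.sql.{DataFrame, Row, SparkSession}
import org.apache.spark.sql.types.{StringType, StructField, StructType}

object BlankRowCleanup {
  // Hypothetical helper: drops rows in which every column is null.
  // Rows with at least one non-null value are kept.
  def dropBlankRows(df: DataFrame): DataFrame = df.na.drop("all")

  def main(args: Array[String]): Unit = {
    val spark = SparkSession
      .builder
      .appName("blankRowCleanup")
      .master("local[*]")
      .getOrCreate()

    // Stand-in for the DataFrame read from the Excel file (assumption):
    // four values followed by the blank row left behind by Backspace.
    val schema = StructType(Seq(StructField("country", StringType, nullable = true)))
    val rows = Seq(Row("india"), Row("India"), Row("india"), Row("India"), Row(null))
    val df = spark.createDataFrame(spark.sparkContext.parallelize(rows), schema)

    // Only the four rows that hold a value remain after the cleanup.
    dropBlankRows(df).show()

    spark.stop()
  }
}
```

In the original code this would amount to calling df.na.drop("all") on the DataFrame returned by load(...) before df.show(). Note that "all" only removes rows that are completely empty; with more than one column, a row that has a value in any column is left alone.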
- Labels: Apache spark, Null Values, Scala