Options
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
11-26-2021 06:14 AM
Hi ,
We can also read CSV directly without writing it to DBFS.
Scala spark Approach
import org.apache.commons.io.IOUtils // jar will be already there in spark cluster no need to worry
import java.net.URL
val urlfile=new URL("https://people.sc.fsu.edu/~jburkardt/data/csv/airtravel.csv")
val testDummyCSV = IOUtils.toString(urlfile,"UTF-8").lines.toList.toDS()
val testcsv = spark
.read.option("header", true)
.option("inferSchema", true)
.csv(testDummyCSV)display(testcsv)
Notebook attached