Multiple sources found for csv
07-22-2024 04:22 AM
After I run a job that uses a Spark 2 jar, running a Python job fails with: Multiple sources found for csv (org.apache.spark.sql.execution.datasources.v2.csv.CSVDataSourceV2, org.apache.spark.sql.execution.datasources.csv.CSVFileFormat), please specify the fully qualified class name.
My Python code is:
df = self.spark.read.csv(full_data_file_path, header=False, schema=schema, sep=sep)
07-22-2024 05:38 AM
@Jackson1111
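It looks like two different CSV data source implementations are on your classpath, and both are registered under the short name "csv", so Spark cannot tell which one spark.read.csv should use.
You need to specify which one you want by its fully qualified class name. Note that the schema is passed with .schema(), not as an option, for example:
df = self.spark.read.format("org.apache.spark.sql.execution.datasources.v2.csv.CSVDataSourceV2").schema(schema).option("header", False).option("sep", sep).load(full_data_file_path)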
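The same read can be wrapped in a small helper so the class name lives in one place. This is only a sketch: read_csv_explicit, CSV_V1 and CSV_V2 are hypothetical names, not part of PySpark, and the two class names are copied from the error message above. It assumes spark is an existing SparkSession and that the chosen class can actually be loaded on your cluster.

```python
# Fully qualified class names, copied from the error message in the question.
CSV_V2 = "org.apache.spark.sql.execution.datasources.v2.csv.CSVDataSourceV2"
CSV_V1 = "org.apache.spark.sql.execution.datasources.csv.CSVFileFormat"


def read_csv_explicit(spark, path, schema, sep=",", source=CSV_V2):
    """Read a CSV file through an explicitly named data source class.

    `spark` is assumed to be an existing SparkSession. This helper is
    hypothetical and only chains standard DataFrameReader calls.
    """
    return (
        spark.read.format(source)      # avoid the ambiguous short name "csv"
        .schema(schema)                # schema is set via .schema(), not .option()
        .option("header", False)
        .option("sep", sep)
        .load(path)
    )
```

It would be called as df = read_csv_explicit(self.spark, full_data_file_path, schema, sep). Naming the class is a workaround. This error usually means jars from two Spark versions are on the same classpath, so it is probably worth checking whether the Spark 2 jar is being picked up by the Python job and removing it from that job's classpath if so.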

