Hi there @Tiwarisk,
if this is the main issue:
@Tiwarisk wrote:
I am writing a file using this but the data type of columns get changed while reading.
You can explicitly specify your table schema like this:
from pyspark.sql.types import StructType, StructField, StringType, IntegerType, DoubleType
schema = StructType([
StructField("column1", StringType(), True),
StructField("column2", IntegerType(), True),
StructField("column3", DoubleType(), True)
])
Then you can read the Excel file like this:
# Read the Excel file with the specified schema
df = (spark.read
    .format("com.crealytics.spark.excel")
    .option("header", "true")
    .schema(schema)  # apply the schema defined above
    .load(path))
After this, writing the file back out shouldn't cause trouble. The root cause is that when writing data to an Excel file using the `com.crealytics.spark.excel` format, the column data types can be altered: the Excel format doesn't natively support all Spark data types, so the conversion isn't always lossless. Reading with an explicit schema restores the intended types.
@Tiwarisk wrote: df.write.format("com.crealytics.spark.excel").option("header", "true").mode("overwrite").save(path)