03-31-2022 06:47 AM
The date field changes when I read data from a source .xls file into a dataframe. In the source Excel file all columns are strings, but I am not sure why the date column alone behaves differently.
In the source file the date is 1/24/2022.
In the dataframe it is 1/24/22.
Code used:
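from pyspark.sql.functions import *
import pyspark.sql.functions as sf
import pyspark.sql.types
import pandas as pd
import os
import glob

# PathSource is assumed to be defined earlier in the notebook
filenames = glob.glob(PathSource + "/*.xls")
dfs = []
for filename in filenames:
    # Read Sheet1 of each workbook into its own dataframe
    xl_file = pd.ExcelFile(filename)
    dfs.append(xl_file.parse('Sheet1'))
# Combine the per-file dataframes into a single dataframe
df = pd.concat(dfs, ignore_index=True)
display(df)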
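One thing that may be worth checking: if the cells are actually stored in Excel as dates rather than text (an assumption, since the file is described as all strings), the values would come back as datetime objects and the displayed format would no longer follow what Excel shows. A minimal sketch of turning such values back into a string with a four-digit year is below. The helper name `format_us_date` is hypothetical and not part of pandas or PySpark.

```python
from datetime import date, datetime


def format_us_date(value):
    """Return M/D/YYYY for date-like values; leave anything else untouched."""
    # datetime is a subclass of date, so this check covers both types
    if isinstance(value, date):
        # Build the string by hand to avoid zero-padded month and day
        return f"{value.month}/{value.day}/{value.year}"
    return value


# Small demonstration with in-memory values instead of a real .xls file
samples = [datetime(2022, 1, 24), date(2022, 12, 5), "1/24/2022"]
formatted = [format_us_date(v) for v in samples]
print(formatted)
```

On a pandas dataframe this could be applied with something like `df["Date"].map(format_us_date)`, where `"Date"` is a placeholder for the real column name. Missing values are not handled in this sketch.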
Thanks in advance for any help or guidance.
Labels: Databricks SQL, Date Field