<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Date field getting changed when reading from excel file to dataframe in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/date-field-getting-changed-when-reading-from-excel-file-to/m-p/24081#M16704</link>
    <description>&lt;P&gt;Hi&amp;nbsp;&lt;A href="https://community.databricks.com/s/profile/0053f000000WpLyAAK" alt="https://community.databricks.com/s/profile/0053f000000WpLyAAK" target="_blank"&gt;@sreedata&lt;/A&gt;&amp;nbsp;(Customer), just a friendly follow-up. Do you still need help, or did&amp;nbsp;&lt;A href="https://community.databricks.com/s/profile/0053f000000WYaIAAW" alt="https://community.databricks.com/s/profile/0053f000000WYaIAAW" target="_blank"&gt;@merca&lt;/A&gt;&amp;nbsp;(Customer)'s response help you find the solution? Please let us know.&lt;/P&gt;</description>
    <pubDate>Thu, 19 May 2022 06:16:10 GMT</pubDate>
    <dc:creator>Anonymous</dc:creator>
    <dc:date>2022-05-19T06:16:10Z</dc:date>
    <item>
      <title>Date field getting changed when reading from excel file to dataframe</title>
      <link>https://community.databricks.com/t5/data-engineering/date-field-getting-changed-when-reading-from-excel-file-to/m-p/24078#M16701</link>
      <description>&lt;P&gt;The date field changes when I read data from the source .xls file into the dataframe. In the source Excel file all columns are strings, but I am not sure why the date column alone behaves differently.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;In the source file the date is 1/24/2022.&lt;/P&gt;&lt;P&gt;In the dataframe it is 1/24/22.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Code used:&lt;/P&gt;&lt;P&gt;&lt;I&gt;from pyspark.sql.functions import *&lt;/I&gt;&lt;/P&gt;&lt;P&gt;&lt;I&gt;import pyspark.sql.functions as sf&lt;/I&gt;&lt;/P&gt;&lt;P&gt;&lt;I&gt;import pyspark.sql.types&lt;/I&gt;&lt;/P&gt;&lt;P&gt;&lt;I&gt;import pandas as pd&lt;/I&gt;&lt;/P&gt;&lt;P&gt;&lt;I&gt;import os&lt;/I&gt;&lt;/P&gt;&lt;P&gt;&lt;I&gt;import glob&lt;/I&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;I&gt;# Collect every .xls file in the source folder&lt;/I&gt;&lt;/P&gt;&lt;P&gt;&lt;I&gt;filenames = glob.glob(PathSource + "/*.xls")&lt;/I&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;I&gt;dfs = []&lt;/I&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;I&gt;# Parse Sheet1 of each file into its own pandas dataframe&lt;/I&gt;&lt;/P&gt;&lt;P&gt;&lt;I&gt;for filename in filenames:&lt;/I&gt;&lt;/P&gt;&lt;P&gt;&lt;I&gt;&amp;nbsp;&amp;nbsp;xl_file = pd.ExcelFile(filename)&lt;/I&gt;&lt;/P&gt;&lt;P&gt;&lt;I&gt;&amp;nbsp;&amp;nbsp;dfs.append(xl_file.parse('Sheet1'))&lt;/I&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;I&gt;# Combine the per-file dataframes into one&lt;/I&gt;&lt;/P&gt;&lt;P&gt;&lt;I&gt;df = pd.concat(dfs, ignore_index=True)&lt;/I&gt;&lt;/P&gt;&lt;P&gt;&lt;I&gt;display(df)&lt;/I&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Thanks in advance for any help or guidance.&lt;/P&gt;</description>
      <pubDate>Thu, 31 Mar 2022 13:47:42 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/date-field-getting-changed-when-reading-from-excel-file-to/m-p/24078#M16701</guid>
      <dc:creator>sreedata</dc:creator>
      <dc:date>2022-03-31T13:47:42Z</dc:date>
    </item>
    <item>
      <title>Re: Date field getting changed when reading from excel file to dataframe</title>
      <link>https://community.databricks.com/t5/data-engineering/date-field-getting-changed-when-reading-from-excel-file-to/m-p/24079#M16702</link>
      <description>&lt;P&gt;@srikanth nair&amp;nbsp;, have you checked the output in pandas? You could also pass &lt;B&gt;&lt;I&gt;parse_dates=False&lt;/I&gt;&lt;/B&gt; to skip date parsing. By default, pandas uses dateutil.parser.parser.&lt;/P&gt;</description>
      <pubDate>Sat, 02 Apr 2022 17:31:28 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/date-field-getting-changed-when-reading-from-excel-file-to/m-p/24079#M16702</guid>
      <dc:creator>merca</dc:creator>
      <dc:date>2022-04-02T17:31:28Z</dc:date>
    </item>
    <item>
      <title>Re: Date field getting changed when reading from excel file to dataframe</title>
      <link>https://community.databricks.com/t5/data-engineering/date-field-getting-changed-when-reading-from-excel-file-to/m-p/24081#M16704</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;A href="https://community.databricks.com/s/profile/0053f000000WpLyAAK" alt="https://community.databricks.com/s/profile/0053f000000WpLyAAK" target="_blank"&gt;@sreedata&lt;/A&gt;&amp;nbsp;(Customer), just a friendly follow-up. Do you still need help, or did&amp;nbsp;&lt;A href="https://community.databricks.com/s/profile/0053f000000WYaIAAW" alt="https://community.databricks.com/s/profile/0053f000000WYaIAAW" target="_blank"&gt;@merca&lt;/A&gt;&amp;nbsp;(Customer)'s response help you find the solution? Please let us know.&lt;/P&gt;</description>
      <pubDate>Thu, 19 May 2022 06:16:10 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/date-field-getting-changed-when-reading-from-excel-file-to/m-p/24081#M16704</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2022-05-19T06:16:10Z</dc:date>
    </item>
    <item>
      <title>Re: Date field getting changed when reading from excel file to dataframe</title>
      <link>https://community.databricks.com/t5/data-engineering/date-field-getting-changed-when-reading-from-excel-file-to/m-p/24082#M16705</link>
      <description>&lt;P&gt;Working fine now, thanks.&lt;/P&gt;</description>
      <pubDate>Thu, 19 May 2022 13:45:57 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/date-field-getting-changed-when-reading-from-excel-file-to/m-p/24082#M16705</guid>
      <dc:creator>sreedata</dc:creator>
      <dc:date>2022-05-19T13:45:57Z</dc:date>
    </item>
    <item>
      <title>Re: Date field getting changed when reading from excel file to dataframe</title>
      <link>https://community.databricks.com/t5/data-engineering/date-field-getting-changed-when-reading-from-excel-file-to/m-p/24083#M16706</link>
      <description>&lt;P&gt;Hi Team, @Merca Ovnerud&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I am facing the same issue. Below is the code snippet I am using:&lt;/P&gt;&lt;P&gt;df=spark.read.format("com.crealytics.spark.excel").option("header","true").load("/mnt/dataplatform/Tenant_PK/Results.xlsx")&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I have a couple of date columns. All of them show in dd/mm/yy format, but they should come through in dd/mm/yyyy format.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Source file has: 26-03-1950&lt;/P&gt;&lt;P&gt;Dataframe has: 26-03-50&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I have tried &lt;B&gt;&lt;I&gt;parse_dates=False&lt;/I&gt;&lt;/B&gt;, but it is not working. Can anyone help with this?&lt;/P&gt;</description>
      <pubDate>Thu, 17 Nov 2022 14:56:19 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/date-field-getting-changed-when-reading-from-excel-file-to/m-p/24083#M16706</guid>
      <dc:creator>Pradeep_Namani</dc:creator>
      <dc:date>2022-11-17T14:56:19Z</dc:date>
    </item>
  </channel>
</rss>

