04-06-2023 11:32 PM
Data from Azure SQL Server was read into Databricks through a JDBC connection (Spark version 2.x) and stored in Gen1. Now the client wants to migrate the data from Gen1 to Gen2. When we ran the same jobs that read data from Azure SQL Server into Databricks through JDBC (with Spark upgraded from 2.x to 3.2), the source-side DATE columns are populated as STRING. Apart from the Spark version upgrade, there is no technical or functional change and no schema change in the source. We are unable to find the root cause. Can someone help me find the exact issue?
04-06-2023 11:55 PM
Spark 2.x and Spark 3.x handle dates differently.
Running Spark 2.x scripts on Spark 3.x will very likely cause issues.
Please check the spark 3 migration guide:
https://spark.apache.org/docs/3.0.2/sql-migration-guide.html#upgrading-from-spark-sql-24-to-30
04-11-2023 10:18 PM
@Werner Stinckens, the link above was extensive and very helpful, but I didn't get the exact details from it. Could you be more specific?
04-07-2023 11:39 PM
Hi @Mani Teja G
Thank you for your question! To assist you better, please take a moment to review the answer and let me know if it best fits your needs.
Please help us select the best solution by clicking on "Select As Best" if it does.
Your feedback will help us ensure that we are providing the best possible service to you. Thank you!
04-11-2023 11:35 PM
For example, there is a Spark option to enable the 'old' date handling.
You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0.
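In a notebook, setting that option could look roughly like the sketch below. It assumes `spark` is the SparkSession that Databricks provides; the helper name `use_legacy_time_parser` is only an illustration, not part of any API.

```python
# Config key mentioned above; "LEGACY" asks Spark 3 to use the
# pre-3.0 date/time parsing behavior.
LEGACY_KEY = "spark.sql.legacy.timeParserPolicy"


def use_legacy_time_parser(spark):
    """Switch the given session to the legacy time parser and return the active value."""
    spark.conf.set(LEGACY_KEY, "LEGACY")
    # Read the value back so the caller can confirm the setting took effect.
    return spark.conf.get(LEGACY_KEY)
```

Calling `use_legacy_time_parser(spark)` before the JDBC read applies the setting for the current session only, so other jobs on the cluster are not affected.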
Frankly, I am not a fan of that approach, as Spark 3 gives you a lot of interesting date functions.
So what you could do is first identify where you have date columns, and then explicitly cast them to dates with the to_date function.
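A minimal sketch of that second approach, assuming you already know which columns should be dates. The column names and the `yyyy-MM-dd` format are placeholders to adapt to your source tables, and both helper names are hypothetical.

```python
from typing import Iterable, List, Tuple


def columns_needing_cast(
    dtypes: Iterable[Tuple[str, str]], expected_dates: Iterable[str]
) -> List[str]:
    """Return the expected date columns that arrived as strings.

    `dtypes` has the shape of `DataFrame.dtypes`: (column name, type name) pairs.
    """
    expected = set(expected_dates)
    return [name for name, dtype in dtypes if name in expected and dtype == "string"]


def cast_to_dates(df, expected_dates, fmt="yyyy-MM-dd"):
    """Cast every string column listed in `expected_dates` to a date column."""
    # Imported here so the helper above can be used without a Spark installation.
    from pyspark.sql import functions as F

    for name in columns_needing_cast(df.dtypes, expected_dates):
        # to_date parses the string with the given pattern; the format is an assumption.
        df = df.withColumn(name, F.to_date(F.col(name), fmt))
    return df
```

For instance, `cast_to_dates(df, ["order_date", "ship_date"])` would leave columns that are already of type date untouched and convert only the ones that came through JDBC as strings.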