<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: to_date not functioning as expected after introduction of arbitrary replaceWhere in Databricks 9.1 LTS in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/to-date-not-functioning-as-expected-after-introduction-of/m-p/31714#M23099</link>
    <description>&lt;P&gt;yes but you can at least do it for testing even with small copy of dataframe, this way is easier to find the problem&lt;/P&gt;</description>
    <pubDate>Fri, 14 Jan 2022 14:54:19 GMT</pubDate>
    <dc:creator>Hubert-Dudek</dc:creator>
    <dc:date>2022-01-14T14:54:19Z</dc:date>
    <item>
      <title>to_date not functioning as expected after introduction of arbitrary replaceWhere in Databricks 9.1 LTS</title>
      <link>https://community.databricks.com/t5/data-engineering/to-date-not-functioning-as-expected-after-introduction-of/m-p/31706#M23091</link>
      <description>&lt;P&gt;I am trying to do a dynamic partition overwrite on delta table using replaceWhere option. This was working fine until I upgraded the DB runtime to 9.1 LTS from 8.3.x. I am concatenating 'year', 'month' and 'day' columns and then using to_date function to define the replaceWhere condition. I could find in the documentation that there was a new feature added as part of 9.1 LTS  - &lt;A href="https://docs.databricks.com/release-notes/runtime/9.1.html?_ga=2.79512596.1662735161.1642082947-671103136.1634111210#delta-now-supports-arbitrary-replacewhere" alt="https://docs.databricks.com/release-notes/runtime/9.1.html?_ga=2.79512596.1662735161.1642082947-671103136.1634111210#delta-now-supports-arbitrary-replacewhere" target="_blank"&gt;Delta now supports arbitrary replaceWhere&lt;/A&gt;. Probably this has introduced a bug here. I am using the spark conf setting to fallback to old behaviour.&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;spark.conf.set("spark.databricks.delta.replaceWhere.dataColumns.enabled", false)&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;But would be nice to have a better solution for this problem. Here is my write statement with replaceWhere.&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;dataframeToWrite.write.format("delta").mode("overwrite").partitionBy("year","month","day").option("replaceWhere", s"to_date(concat(year, '-', month, '-', day)) in ($datesList)").save(outputLocation)&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;Error message:&lt;/P&gt;&lt;P&gt;Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 11.0 failed 4 times, most recent failure: Lost task 0.3 in stage 11.0 (TID 13) (): java.lang.UnsupportedOperationException: Cannot generate code for expression: to_date(concat(cast(year#2810 as string), -, cast(month#2863 as string), -, cast(day#2916 as string)), None)&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;PS: I've also tried specifying the format in to_date, but it doesn't help either.&lt;/P&gt;</description>
      <pubDate>Thu, 13 Jan 2022 15:46:18 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/to-date-not-functioning-as-expected-after-introduction-of/m-p/31706#M23091</guid>
      <dc:creator>kolangareth</dc:creator>
      <dc:date>2022-01-13T15:46:18Z</dc:date>
    </item>
    <item>
      <title>Re: to_date not functioning as expected after introduction of arbitrary replaceWhere in Databricks 9.1 LTS</title>
      <link>https://community.databricks.com/t5/data-engineering/to-date-not-functioning-as-expected-after-introduction-of/m-p/31707#M23092</link>
      <description>&lt;P&gt;Hi @Prasanth Kolangareth​&amp;nbsp;- Welcome and thank you for your question. I'm sorry to hear about the trouble you're having. Let's give the community a chance to respond first. Thanks for your patience.&lt;/P&gt;</description>
      <pubDate>Thu, 13 Jan 2022 17:52:10 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/to-date-not-functioning-as-expected-after-introduction-of/m-p/31707#M23092</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2022-01-13T17:52:10Z</dc:date>
    </item>
    <item>
      <title>Re: to_date not functioning as expected after introduction of arbitrary replaceWhere in Databricks 9.1 LTS</title>
      <link>https://community.databricks.com/t5/data-engineering/to-date-not-functioning-as-expected-after-introduction-of/m-p/31708#M23093</link>
      <description>&lt;P&gt;strange,&lt;/P&gt;&lt;P&gt;have you tried to create a column with to_date in your dataframe, so not in the replaceWhere condition?&lt;/P&gt;&lt;P&gt;df2 = df1.withColumn(to_date(..., "yyyyMMdd")) something like that&lt;/P&gt;</description>
      <pubDate>Fri, 14 Jan 2022 08:10:42 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/to-date-not-functioning-as-expected-after-introduction-of/m-p/31708#M23093</guid>
      <dc:creator>-werners-</dc:creator>
      <dc:date>2022-01-14T08:10:42Z</dc:date>
    </item>
    <item>
      <title>Re: to_date not functioning as expected after introduction of arbitrary replaceWhere in Databricks 9.1 LTS</title>
      <link>https://community.databricks.com/t5/data-engineering/to-date-not-functioning-as-expected-after-introduction-of/m-p/31709#M23094</link>
      <description>&lt;P&gt;Hi @Werner Stinckens​&amp;nbsp;, the to_date function works fine everywhere else like filter, withColumn etc. If you meant to create a new column in the dataframe and then use it in the replaceWhere - it's not something I want to do, because to do that I must add this new column in the target delta table also. &lt;/P&gt;</description>
      <pubDate>Fri, 14 Jan 2022 09:04:04 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/to-date-not-functioning-as-expected-after-introduction-of/m-p/31709#M23094</guid>
      <dc:creator>kolangareth</dc:creator>
      <dc:date>2022-01-14T09:04:04Z</dc:date>
    </item>
    <item>
      <title>Re: to_date not functioning as expected after introduction of arbitrary replaceWhere in Databricks 9.1 LTS</title>
      <link>https://community.databricks.com/t5/data-engineering/to-date-not-functioning-as-expected-after-introduction-of/m-p/31710#M23095</link>
      <description>&lt;P&gt;my only guess is that either the replaceWhere cannot take expressions, or the expression is faulty.&lt;/P&gt;</description>
      <pubDate>Fri, 14 Jan 2022 10:43:36 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/to-date-not-functioning-as-expected-after-introduction-of/m-p/31710#M23095</guid>
      <dc:creator>-werners-</dc:creator>
      <dc:date>2022-01-14T10:43:36Z</dc:date>
    </item>
    <item>
      <title>Re: to_date not functioning as expected after introduction of arbitrary replaceWhere in Databricks 9.1 LTS</title>
      <link>https://community.databricks.com/t5/data-engineering/to-date-not-functioning-as-expected-after-introduction-of/m-p/31711#M23096</link>
      <description>&lt;P&gt;I think it is neither. &lt;/P&gt;&lt;UL&gt;&lt;LI&gt;replaceWhere can take expressions, because I've tried using concat without to_date and it doesn't fail. &lt;/LI&gt;&lt;LI&gt;The expression isn't faulty, because it works with the spark conf setting to fallback to old behaviour and also the same expression works in filter, withColumn.&lt;/LI&gt;&lt;/UL&gt;</description>
      <pubDate>Fri, 14 Jan 2022 11:22:53 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/to-date-not-functioning-as-expected-after-introduction-of/m-p/31711#M23096</guid>
      <dc:creator>kolangareth</dc:creator>
      <dc:date>2022-01-14T11:22:53Z</dc:date>
    </item>
    <item>
      <title>Re: to_date not functioning as expected after introduction of arbitrary replaceWhere in Databricks 9.1 LTS</title>
      <link>https://community.databricks.com/t5/data-engineering/to-date-not-functioning-as-expected-after-introduction-of/m-p/31712#M23097</link>
      <description>&lt;P&gt;Can you try to generate that column earlier:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;.withColumn("conditionColumn", to_date...&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;to simplify code so replaceWhere will be easier to debug as now it is complaining about to_date&lt;/P&gt;&lt;P&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 14 Jan 2022 14:00:04 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/to-date-not-functioning-as-expected-after-introduction-of/m-p/31712#M23097</guid>
      <dc:creator>Hubert-Dudek</dc:creator>
      <dc:date>2022-01-14T14:00:04Z</dc:date>
    </item>
    <item>
      <title>Re: to_date not functioning as expected after introduction of arbitrary replaceWhere in Databricks 9.1 LTS</title>
      <link>https://community.databricks.com/t5/data-engineering/to-date-not-functioning-as-expected-after-introduction-of/m-p/31713#M23098</link>
      <description>&lt;P&gt;Hi @Hubert Dudek​&amp;nbsp;, to do that I must also add this new column in the target delta table, which I don't want to do. Adding this new column only in the dataframe to write will not work because of two reasons,&lt;/P&gt;&lt;OL&gt;&lt;LI&gt;delta will give a schema mismacth error while writing.&lt;/LI&gt;&lt;LI&gt;I think the replaceWhere condition is basically applied on the target delta table and not on the input dataframe. So the condition should be on some columns which is already existing in the table.&lt;/LI&gt;&lt;/OL&gt;</description>
      <pubDate>Fri, 14 Jan 2022 14:20:00 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/to-date-not-functioning-as-expected-after-introduction-of/m-p/31713#M23098</guid>
      <dc:creator>kolangareth</dc:creator>
      <dc:date>2022-01-14T14:20:00Z</dc:date>
    </item>
    <item>
      <title>Re: to_date not functioning as expected after introduction of arbitrary replaceWhere in Databricks 9.1 LTS</title>
      <link>https://community.databricks.com/t5/data-engineering/to-date-not-functioning-as-expected-after-introduction-of/m-p/31714#M23099</link>
      <description>&lt;P&gt;yes but you can at least do it for testing even with small copy of dataframe, this way is easier to find the problem&lt;/P&gt;</description>
      <pubDate>Fri, 14 Jan 2022 14:54:19 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/to-date-not-functioning-as-expected-after-introduction-of/m-p/31714#M23099</guid>
      <dc:creator>Hubert-Dudek</dc:creator>
      <dc:date>2022-01-14T14:54:19Z</dc:date>
    </item>
    <item>
      <title>Re: to_date not functioning as expected after introduction of arbitrary replaceWhere in Databricks 9.1 LTS</title>
      <link>https://community.databricks.com/t5/data-engineering/to-date-not-functioning-as-expected-after-introduction-of/m-p/31715#M23100</link>
      <description>&lt;P&gt;@Prasanth Kolangareth​&amp;nbsp;- Does Hubert's answer resolve the problem for you?  If yes, would you be happy to mark it as best so that other members can find the solution more quickly?&lt;/P&gt;</description>
      <pubDate>Wed, 26 Jan 2022 16:32:41 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/to-date-not-functioning-as-expected-after-introduction-of/m-p/31715#M23100</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2022-01-26T16:32:41Z</dc:date>
    </item>
    <item>
      <title>Re: to_date not functioning as expected after introduction of arbitrary replaceWhere in Databricks 9</title>
      <link>https://community.databricks.com/t5/data-engineering/to-date-not-functioning-as-expected-after-introduction-of/m-p/110537#M43602</link>
      <description>&lt;DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;SELECT&lt;/SPAN&gt;&lt;SPAN&gt; TO_DATE(&lt;/SPAN&gt;&lt;SPAN&gt;'20250217'&lt;/SPAN&gt;&lt;SPAN&gt;,&lt;/SPAN&gt;&lt;SPAN&gt;'YYYYMMDD'&lt;/SPAN&gt;&lt;SPAN&gt;); gives the error: PARSE_SYNTAX_ERROR&amp;nbsp; syntax error at or near 'select'. sqlstate: 42601.&amp;nbsp; It datagrip, it works no problem and displays the date.&lt;/SPAN&gt;&lt;/DIV&gt;&lt;/DIV&gt;</description>
      <pubDate>Tue, 18 Feb 2025 22:01:11 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/to-date-not-functioning-as-expected-after-introduction-of/m-p/110537#M43602</guid>
      <dc:creator>ltreweek</dc:creator>
      <dc:date>2025-02-18T22:01:11Z</dc:date>
    </item>
    <item>
      <title>Re: to_date not functioning as expected after introduction of arbitrary replaceWhere in Databricks 9</title>
      <link>https://community.databricks.com/t5/data-engineering/to-date-not-functioning-as-expected-after-introduction-of/m-p/111887#M44030</link>
      <description>&lt;P&gt;I'm not sure what runtime you're on, but if you're using a runtime with spark &amp;gt;= 3.0 this is a datetime pattern issue. Databricks SQL uses the following for datetime patterns&amp;nbsp;&lt;A href="https://docs.databricks.com/aws/en/sql/language-manual/sql-ref-datetime-pattern" target="_blank"&gt;Datetime patterns | Databricks Documentation&lt;/A&gt;&amp;nbsp;in the latest runtimes.&lt;BR /&gt;&lt;BR /&gt;In your case I believe you'd want to change your statement to the following:&lt;BR /&gt;&lt;SPAN&gt;SELECT&lt;/SPAN&gt;&lt;SPAN&gt;&amp;nbsp;TO_DATE(&lt;/SPAN&gt;&lt;SPAN&gt;'20250217'&lt;/SPAN&gt;&lt;SPAN&gt;,&lt;/SPAN&gt;&lt;SPAN&gt;'yyyyMMdd'&lt;/SPAN&gt;&lt;SPAN&gt;);&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 06 Mar 2025 02:49:04 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/to-date-not-functioning-as-expected-after-introduction-of/m-p/111887#M44030</guid>
      <dc:creator>array-admiral</dc:creator>
      <dc:date>2025-03-06T02:49:04Z</dc:date>
    </item>
  </channel>
</rss>

