<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Glue database and saveAsTable in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/glue-database-and-saveastable/m-p/82358#M36624</link>
    <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/115018"&gt;@mexcram&lt;/a&gt;,&amp;nbsp;when you save a DataFrame as a Delta table to S3 and AWS Glue with PySpark's `saveAsTable`, changing the `path` option or argument often leaves the Glue table location set to a placeholder path (e.g., `s3://my-bucket/my_table-__PLACEHOLDER__`), even though the table remains queryable. The `path` you pass is not reflected in the location that Glue's metadata records. The current workaround is to update the location manually with boto3 after saving the table. As of now, there is no built-in way to make `saveAsTable` write the expected path to Glue, so continuing with the boto3 workaround is advisable.&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;If you need further assistance, feel free to ask!&lt;/SPAN&gt;&lt;/P&gt;</description>
    <pubDate>Thu, 08 Aug 2024 10:55:03 GMT</pubDate>
    <dc:creator>Retired_mod</dc:creator>
    <dc:date>2024-08-08T10:55:03Z</dc:date>
    <item>
      <title>Glue database and saveAsTable</title>
      <link>https://community.databricks.com/t5/data-engineering/glue-database-and-saveastable/m-p/82067#M36503</link>
      <description>&lt;P&gt;Hello all,&lt;/P&gt;&lt;P&gt;I am saving my DataFrame as a Delta table to S3 and AWS Glue using PySpark and `saveAsTable`. This works, but something curious happens when I try to change the `path` (as an option or as an argument of `saveAsTable`).&lt;/P&gt;&lt;P&gt;The location in my Glue table is not updated to the correct path; instead, the suffix `__PLACEHOLDER__` is added. For example, if I save the DataFrame as `my_table` at the path `s3://my-bucket/data/my_table`, the Glue table shows the location `s3://my-bucket/my_table-__PLACEHOLDER__`. However, I can still query my table through SQL or PySpark.&lt;/P&gt;&lt;P&gt;My current workaround is to save the table and then update the location in Glue using boto3.&lt;/P&gt;&lt;P&gt;Do you know if it is possible to make `saveAsTable` work as expected? Or do you have another workaround?&lt;/P&gt;</description>
      <pubDate>Tue, 06 Aug 2024 15:23:56 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/glue-database-and-saveastable/m-p/82067#M36503</guid>
      <dc:creator>mexcram</dc:creator>
      <dc:date>2024-08-06T15:23:56Z</dc:date>
    </item>
    <item>
      <title>Re: Glue database and saveAsTable</title>
      <link>https://community.databricks.com/t5/data-engineering/glue-database-and-saveastable/m-p/82358#M36624</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/115018"&gt;@mexcram&lt;/a&gt;,&amp;nbsp;when you save a DataFrame as a Delta table to S3 and AWS Glue with PySpark's `saveAsTable`, changing the `path` option or argument often leaves the Glue table location set to a placeholder path (e.g., `s3://my-bucket/my_table-__PLACEHOLDER__`), even though the table remains queryable. The `path` you pass is not reflected in the location that Glue's metadata records. The current workaround is to update the location manually with boto3 after saving the table. As of now, there is no built-in way to make `saveAsTable` write the expected path to Glue, so continuing with the boto3 workaround is advisable.&lt;/P&gt;
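&lt;P&gt;For reference, the boto3 workaround could look roughly like the sketch below. It is only an illustration under a few assumptions: the helper names `build_table_input` and `fix_glue_location` are hypothetical, `ALLOWED_KEYS` lists just a subset of the fields that Glue accepts in `TableInput`, and only `StorageDescriptor.Location` is rewritten. The read-only fields returned by `get_table` are dropped because `update_table` does not accept them.&lt;/P&gt;

```python
# Subset of the keys that Glue's update_table accepts in TableInput.
# get_table also returns read-only keys (CreateTime, DatabaseName, ...)
# that have to be dropped before the dict can be sent back.
ALLOWED_KEYS = (
    "Name", "Description", "Owner", "Retention", "StorageDescriptor",
    "PartitionKeys", "TableType", "Parameters",
)


def build_table_input(table, new_location):
    """Return a TableInput dict copied from `table` with a new S3 location."""
    table_input = {k: table[k] for k in ALLOWED_KEYS if k in table}
    # Copy the storage descriptor so the caller's dict is left untouched.
    storage = dict(table_input.get("StorageDescriptor", {}))
    storage["Location"] = new_location
    table_input["StorageDescriptor"] = storage
    return table_input


def fix_glue_location(database, name, new_location):
    """Fetch the Glue table, rewrite its location, and push the update."""
    import boto3  # imported here so the helper above works without AWS access

    glue = boto3.client("glue")
    table = glue.get_table(DatabaseName=database, Name=name)["Table"]
    glue.update_table(
        DatabaseName=database,
        TableInput=build_table_input(table, new_location),
    )
```

&lt;P&gt;After `saveAsTable` finishes, you would call something like `fix_glue_location("my_db", "my_table", "s3://my-bucket/data/my_table")`, where `my_db` is a placeholder for your Glue database name.&lt;/P&gt;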
&lt;P&gt;&lt;SPAN&gt;If you need further assistance, feel free to ask!&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 08 Aug 2024 10:55:03 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/glue-database-and-saveastable/m-p/82358#M36624</guid>
      <dc:creator>Retired_mod</dc:creator>
      <dc:date>2024-08-08T10:55:03Z</dc:date>
    </item>
    <item>
      <title>Re: Glue database and saveAsTable</title>
      <link>https://community.databricks.com/t5/data-engineering/glue-database-and-saveastable/m-p/82382#M36630</link>
      <description>&lt;P&gt;Thank you, I will continue doing this.&lt;/P&gt;</description>
      <pubDate>Thu, 08 Aug 2024 12:11:56 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/glue-database-and-saveastable/m-p/82382#M36630</guid>
      <dc:creator>mexcram</dc:creator>
      <dc:date>2024-08-08T12:11:56Z</dc:date>
    </item>
  </channel>
</rss>

