- 11084 Views
- 9 replies
- 1 kudos
Hi, I am trying to write the contents of a dataframe into a parquet table using the command below.
df.write.mode("overwrite").format("parquet").saveAsTable("sample_parquet_table")
The dataframe contains an extract from one of our source systems, which h...
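For context, a self-contained sketch of that write pattern, with a placeholder dataframe standing in for the source-system extract (only the final write call comes from the post):

```python
# Minimal sketch of the overwrite-to-managed-table pattern from the question.
# The dataframe contents here are placeholders, not the poster's actual data.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("parquet-overwrite").getOrCreate()

# Hypothetical extract standing in for the source-system data.
df = spark.createDataFrame(
    [(1, "alpha"), (2, "beta")],
    ["id", "value"],
)

# Overwrites the managed table (and creates it if it does not exist yet).
df.write.mode("overwrite").format("parquet").saveAsTable("sample_parquet_table")
```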
- 3838 Views
- 3 replies
- 3 kudos
Hi, We have a parquet table (folder) in an Azure Storage Account. The table is partitioned by the column PeriodId (which represents a day in the format YYYYMMDD) and has data from 20181001 until 20211121 (yesterday). We have a new development that adds a new column ...
Latest Reply
I think the problem is in the overwrite: by default, an overwrite replaces all partition folders. The solution is to use dynamic partition overwrite, so that only the partitions present in the new data are overwritten and the old partitions are not affected:
spark.conf.set("spark.sql.sources.partitionOverwriteMode", "dynamic")
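A runnable sketch of the full pattern, assuming a local placeholder path (the thread uses an Azure Storage Account folder instead) and made-up columns alongside PeriodId:

```python
# Dynamic partition overwrite: only partitions present in the incoming
# dataframe are replaced; every other PeriodId folder is left as it is.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("dynamic-overwrite").getOrCreate()

spark.conf.set("spark.sql.sources.partitionOverwriteMode", "dynamic")

# Hypothetical increment: one new day of data, including the new column.
df = spark.createDataFrame(
    [(20211122, "abc", 42)],
    ["PeriodId", "key", "new_column"],
)

(df.write
    .mode("overwrite")
    .format("parquet")
    .partitionBy("PeriodId")
    .save("/tmp/parquet_table"))  # placeholder for the Azure folder path
```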
- 1739 Views
- 1 reply
- 0 kudos
After I vacuum the tables, do I need to update the manifest table and parquet table to refresh my external tables for integrations to work?
Latest Reply
Manifest files need to be re-created when partitions are added or altered. Since VACUUM only deletes files belonging to old, historical versions of the table (the current data files are untouched), you shouldn't need to create an updated manifest unless you are also running an OPTIMIZE, which rewrites the data files.
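A minimal sketch of that workflow, assuming a placeholder table name my_delta_table (the original thread doesn't name the table):

```python
# Regenerating the symlink manifest after an OPTIMIZE. GENERATE is the
# documented Delta Lake command for rebuilding the manifest that external
# readers consume; the table name here is hypothetical.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("manifest-refresh").getOrCreate()

# OPTIMIZE rewrites data files, so the manifest must be rebuilt afterwards.
spark.sql("OPTIMIZE my_delta_table")
spark.sql("GENERATE symlink_format_manifest FOR TABLE my_delta_table")

# VACUUM alone only deletes files from old table versions,
# so no manifest refresh is needed after it.
spark.sql("VACUUM my_delta_table RETAIN 168 HOURS")
```

Delta Lake can also keep the manifest up to date automatically if you set the table property delta.compatibility.symlinkFormatManifest.enabled to true.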