12-05-2024 10:51 PM
Using CTAS (CREATE TABLE AS SELECT) might be a more robust solution for your use case:
- Independence: CTAS creates a new, independent copy of the data, avoiding dependencies on the source table.
- Simplified access control: Access rights can be managed solely within the target environment.
- Flexibility: You can easily modify the table structure or apply transformations during the copy process.
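To make the CTAS option concrete, here is a minimal sketch that builds the statement as a string so it can be inspected before running. The table names (`prod.sales.orders`, `dev.sales.orders`), the filter, and the helper functions are placeholders I made up for illustration; on a Databricks cluster you would pass the result to `spark.sql(...)`.

```python
from typing import Optional, Sequence


def quote_ident(name: str) -> str:
    """Backtick-quote each part of a dotted table name (catalog.schema.table)."""
    return ".".join("`" + part.replace("`", "``") + "`" for part in name.split("."))


def build_ctas(
    source: str,
    target: str,
    columns: Optional[Sequence[str]] = None,
    where: Optional[str] = None,
) -> str:
    """Build a CREATE OR REPLACE TABLE ... AS SELECT statement.

    `where` is inserted verbatim, so only pass trusted, hand-written filters.
    """
    cols = ", ".join(quote_ident(c) for c in columns) if columns else "*"
    sql = (
        f"CREATE OR REPLACE TABLE {quote_ident(target)} "
        f"AS SELECT {cols} FROM {quote_ident(source)}"
    )
    if where:
        sql += f" WHERE {where}"
    return sql


# Hypothetical names: copy recent orders from a prod catalog into a dev catalog.
ctas_sql = build_ctas(
    "prod.sales.orders",
    "dev.sales.orders",
    where="order_date >= '2024-01-01'",
)
print(ctas_sql)
# On a cluster (assumes an active SparkSession named `spark`):
# spark.sql(ctas_sql)
```

Because the target is written as a full copy, permissions only need to be granted on the target table; the trade-off is that every refresh rewrites the selected data.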
Optimizing the Cloning Process
- Use Delta Lake's `CLONE` command for efficient copying when appropriate.
- Implement incremental updates to minimize data transfer and processing time for subsequent refreshes.
- Consider using Databricks Workflows to automate and schedule the cloning process.
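As a rough illustration of the `CLONE` route, the sketch below builds a clone statement for a scheduled refresh. The table names and the helper are hypothetical. My understanding is that re-running `CREATE OR REPLACE TABLE ... DEEP CLONE` against an existing clone copies only files that changed since the last run, but please verify that against the Databricks documentation for your runtime version.

```python
def build_clone(source: str, target: str, deep: bool = True) -> str:
    """Build a Delta CLONE statement.

    deep=True  -> DEEP CLONE: copies data files, target is independent of source.
    deep=False -> SHALLOW CLONE: copies metadata only, target still reads the
                  source's data files (so readers need access to them).
    """
    kind = "DEEP" if deep else "SHALLOW"
    return f"CREATE OR REPLACE TABLE {target} {kind} CLONE {source}"


# Hypothetical table names for a prod-to-dev refresh.
refresh_sql = build_clone("prod.sales.orders", "dev.sales.orders")
print(refresh_sql)
# In a notebook task scheduled via Databricks Workflows
# (assumes an active SparkSession named `spark`):
# spark.sql(refresh_sql)
```

Putting this in a notebook or Python task and scheduling it as a Workflows job would cover the automation point above.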
Addressing Open Questions
- The proposed approach is viable, but consider using CTAS instead of shallow clones for better isolation and simpler access management.
- Access rights to underlying data files are indeed a concern with shallow clones. CTAS avoids this issue by creating independent copies.