Managing streaming checkpoints with Unity Catalog
11-26-2024 12:08 AM
This is partly a question, partly a feature request: how do you guys handle streaming checkpoints in combination with Unity Catalog managed tables?
It seems like the only way is to create a volume, and manually specify paths in it as streaming checkpoints. Do you use a single volume per catalog? A single volume per schema, or even one volume per table?
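To make the volume approach concrete, here is a minimal sketch of one possible convention: one volume per schema, with a subdirectory per table and per checkpoint. The volume name `checkpoints` and the helper `checkpoint_path` are made up for illustration and are not a Databricks API; the only thing taken as given is that Unity Catalog volumes are addressed as `/Volumes/<catalog>/<schema>/<volume>/...`.

```python
from pathlib import PurePosixPath

# Hypothetical convention: every schema has one volume with this name.
CHECKPOINT_VOLUME = "checkpoints"


def checkpoint_path(full_table_name: str, checkpoint_name: str,
                    volume: str = CHECKPOINT_VOLUME) -> str:
    """Build a checkpoint location for a table inside a per-schema volume."""
    parts = full_table_name.split(".")
    if len(parts) != 3 or not all(parts):
        raise ValueError("expected a name of the form catalog.schema.table")
    if not checkpoint_name or "/" in checkpoint_name:
        raise ValueError("checkpoint_name must be a single path segment")
    catalog, schema, table = parts
    # Layout: /Volumes/<catalog>/<schema>/<volume>/<table>/<checkpoint_name>
    return str(PurePosixPath("/Volumes", catalog, schema, volume,
                             table, checkpoint_name))


# Intended use in a streaming write (not executed here):
#   df.writeStream
#     .option("checkpointLocation",
#             checkpoint_path("main.sales.orders", "bronze_to_silver"))
#     .toTable("main.sales.orders")
```

Keeping the table name in the path means all checkpoints for a table live under one directory, which makes it easier to find them later. Whether a per-schema volume is the right granularity is exactly the open question here.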
And how do you handle cleanup of the streaming checkpoints when you drop a table? Do you go in as an admin and delete the checkpoint manually?
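For illustration, the manual cleanup could be scripted roughly like this. It is only a sketch and assumes that all checkpoints for a table sit under a single directory in a volume, that the volume is reachable as a regular file path from the cluster, and that the stream writing to it has already been stopped. The function name `remove_table_checkpoints` is hypothetical; in a notebook, `dbutils.fs.rm(path, True)` would be an alternative to `shutil.rmtree`.

```python
import shutil
from pathlib import Path


def remove_table_checkpoints(table_checkpoint_dir: str) -> bool:
    """Delete a table's checkpoint directory recursively.

    Returns True if a directory was removed, False if there was nothing
    to remove. The caller passes the per-table directory explicitly, e.g.
    a path under /Volumes/<catalog>/<schema>/<volume>/ (assumed layout).
    """
    path = Path(table_checkpoint_dir)
    if not path.is_dir():
        return False
    shutil.rmtree(path)
    return True
```

This would have to be called right after the `DROP TABLE`, by someone with write access to the volume, so it is still a convention you have to enforce yourself rather than something the catalog does for you.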
So for the feature request to Databricks:
For managed tables in Unity Catalog, it would be great if there was a function where you could provide the catalog.schema.table and a checkpoint name, and it would return a path you could use as a streaming checkpoint location. If the table gets dropped, this location should be deleted as well. In other words, managed tables should have the option of managed checkpoint locations.