Glue Catalog Metadata Management with Enforced Tagging in Databricks
10-16-2024 10:01 AM
As part of the data governance team, we're trying to enforce table-level tagging when users create tables in a Databricks environment where metadata is managed by AWS Glue Catalog (non-Unity Catalog). Is there a way to require tagging at table creation, or to set up a process that enforces tagging in such an environment? Any guidance on best practices or tools to implement this would be appreciated.
Labels: Delta Lake, Spark
10-23-2024 09:02 AM
Hi @KartRasi_10779,
How are you doing today?
As I understand it, AWS Glue Catalog doesn't natively support mandatory tagging at table creation, so consider building a tagging enforcement process around a custom script or trigger that checks for table tags after creation. You could use AWS Lambda functions combined with Glue triggers to monitor for new table creation and automatically apply or validate tags. Another approach is to run Databricks jobs periodically to check that all tables have the required tags and to alert when tags are missing. Best practices would include setting up clear tagging policies and using automation tools like the Databricks REST API or Glue Crawlers so that tags are applied consistently across all tables.
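As a rough illustration of the periodic check, here is a minimal Python sketch. It assumes tags are stored as Glue table parameters (the `Parameters` map on each table, which is where table properties usually land), and the required keys `owner` and `data_classification` are a hypothetical policy, not anything Glue defines. The helper names are made up for this example; the only AWS call used is the boto3 Glue `get_tables` paginator.

```python
from typing import Dict, Iterable, Iterator, List

# Hypothetical tagging policy: keys every table is expected to carry.
REQUIRED_TAGS = ("owner", "data_classification")


def missing_tags(table: dict, required: Iterable[str] = REQUIRED_TAGS) -> List[str]:
    """Return the required keys that are absent or blank in the table's Parameters."""
    params = table.get("Parameters") or {}
    missing = []
    for key in required:
        value = params.get(key)
        if value is None or not str(value).strip():
            missing.append(key)
    return missing


def audit_tables(tables: Iterable[dict],
                 required: Iterable[str] = REQUIRED_TAGS) -> Dict[str, List[str]]:
    """Map each non-compliant table name to the list of tag keys it is missing."""
    required = tuple(required)
    report = {}
    for table in tables:
        gaps = missing_tags(table, required)
        if gaps:
            report[table["Name"]] = gaps
    return report


def list_glue_tables(database: str) -> Iterator[dict]:
    """Yield table definitions from one Glue database (needs AWS credentials)."""
    import boto3  # imported lazily so the pure helpers above work without AWS access

    paginator = boto3.client("glue").get_paginator("get_tables")
    for page in paginator.paginate(DatabaseName=database):
        yield from page["TableList"]
```

In a scheduled job you might call `audit_tables(list_glue_tables("analytics"))`, where `analytics` is a placeholder database name, and send the resulting report to whatever alerting channel you already use.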
Give it a try and let me know the outcome.
Hoping you have a good day.
Regards,
Brahma
10-28-2024 01:48 PM
You can use lakeFS pre-merge hooks to enforce this. It works well with this stack: https://lakefs.io/blog/lakefs-hooks/

