02-06-2024 01:37 AM
We are looking into the use of managed tables on Databricks. As this decision won’t be easy to reverse, I am reaching out to all of you fine folks to learn more about your experience with them.
If I understand correctly, we don't have to deal with managing the storage, as Databricks will generate GUIDs for schemas and tables. Readability will be worse on the storage itself (we are using ADLS at the moment), but I don't think that matters much, as we will still have good readability within the Databricks environment.
Together with managed tables, we were thinking of using tags along with the built-in metadata so we can build and share the tree structure if needed.
What are the pros and cons of managed tables?
What are some things I should look into before deciding?
02-06-2024 03:31 AM
Managed tables are tables that are completely managed by Databricks, i.e. if we drop the table from Databricks, the underlying files will also be deleted.
Ideally they should be used in the following cases:
02-06-2024 03:49 AM - edited 02-06-2024 03:58 AM
Thanks for your response @Hkesharwani
In what scenario will we need to drop tables? Can't we just avoid giving drop table privileges to our analysts, superusers and users?
Our current thought is that we will manage access and data lifecycle anyway.
In addition, can't we just use the UNDROP command within 7 days? (We are using UC.)
UNDROP TABLE | Databricks on AWS
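For anyone following along, here is a rough sketch of what the recovery path could look like on Unity Catalog. It only builds the SQL strings, so it runs anywhere; on Databricks each statement would be passed to `spark.sql`. The schema `main.sales` and table `orders` are made-up names, and the 7-day window is the one mentioned above, so please check the linked UNDROP page for the exact retention rules in your workspace.

```python
from typing import List


def undrop_statements(schema: str, table: str) -> List[str]:
    """Return SQL to list dropped tables in a schema and restore one by name.

    `schema` is a catalog-qualified schema such as "main.sales" (hypothetical).
    """
    return [
        # List tables that were dropped and can still be restored.
        f"SHOW TABLES DROPPED IN {schema}",
        # Restore the dropped table by name.
        f"UNDROP TABLE {schema}.{table}",
    ]


for stmt in undrop_statements("main.sales", "orders"):
    print(stmt)  # on Databricks: spark.sql(stmt)
```

Checking `SHOW TABLES DROPPED` first is useful because it shows whether the table is still recoverable before you rely on UNDROP.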
02-06-2024 06:49 AM
Hi cltj,
As I mentioned, you may drop tables when you only need to keep data temporarily. And yes, you can grant only the required access to the team.
I believe this will be a great help for you: https://docs.databricks.com/en/sql/language-manual/sql-ref-privileges.html
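As a minimal sketch of "grant only the required access", the helper below builds Unity Catalog GRANT statements for a group. The table `main.sales.orders` and the group `analysts` are hypothetical; `SELECT` and `MODIFY` are privileges listed on the page linked above. As far as I understand, in UC dropping a table is tied to ownership (or elevated management rights) rather than to a separate grantable DROP privilege, but that is worth verifying against the same page.

```python
from typing import List, Sequence


def grant_statements(
    table: str, principal: str, privileges: Sequence[str] = ("SELECT",)
) -> List[str]:
    """Build one GRANT statement per privilege for a user or group."""
    # Backticks around the principal allow names with spaces or special characters.
    return [f"GRANT {p} ON TABLE {table} TO `{principal}`" for p in privileges]


# Read-only access for analysts (hypothetical group name).
for stmt in grant_statements("main.sales.orders", "analysts"):
    print(stmt)  # on Databricks: spark.sql(stmt)
```

Keeping table ownership with a service principal or a small admin group, and granting everyone else only what they need, is one way to make accidental drops unlikely.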
02-06-2024 07:37 AM
I would recommend using managed tables for table backups, and for tables used for data processing in notebooks that can be dropped at the end of the process, i.e. staging tables. I have not explored how to copy a managed table from the Dev to the QA environment. In the case of an external table, we can copy the storage folder from the Dev storage account to the QA storage account and create the DDL.
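To illustrate the difference in DDL, here is a small sketch: the only change between a managed and an external table is whether a `LOCATION` clause is present. The table names, the column list and the `abfss://` path are placeholders, not values from this thread, and on Unity Catalog the path would also need to be covered by an external location you have access to.

```python
from typing import Optional


def create_table_sql(full_name: str, columns: str, location: Optional[str] = None) -> str:
    """Build CREATE TABLE DDL: managed without a location, external with one."""
    ddl = f"CREATE TABLE IF NOT EXISTS {full_name} ({columns})"
    if location is not None:
        # A LOCATION clause makes the table external: dropping it leaves the files in place.
        ddl += f" LOCATION '{location}'"
    return ddl


# Managed staging table: Databricks decides where the files live.
print(create_table_sql("dev.staging.orders_tmp", "id BIGINT, amount DOUBLE"))

# External table registered over a folder copied to the QA storage account (placeholder path).
print(
    create_table_sql(
        "qa.sales.orders",
        "id BIGINT, amount DOUBLE",
        "abfss://data@qastorage.dfs.core.windows.net/sales/orders",
    )
)
```

On Databricks each string would be run with `spark.sql`.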
02-12-2024 08:56 AM
Hey there! Thanks a bunch for being part of our awesome community! 🎉
We love having you around and appreciate all your questions. Take a moment to check out the responses – you'll find some great info. Your input is valuable, so pick the best solution for you. And remember, if you ever need more help, we're here for you!
Keep being awesome! 😊🚀