Data Engineering

Experiences using managed tables

cltj
New Contributor III

We are looking into the use of managed tables on Databricks. As this decision won't be easy to reverse, I am reaching out to all of you fine folks to learn more about your experience with them.

If I understand correctly, we don't have to deal with managing the storage, as Databricks will generate GUIDs for schemas and tables. The readability will be worse on the storage itself (using ADLS at the moment), but I don't think that matters much, as we will still have good readability within the Databricks environment.
Together with the managed tables, we were thinking of using tags together with the built-in metadata so we can build and share the tree structure if needed.
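
A minimal sketch of the tagging idea, assuming Unity Catalog. The table name `main.sales.orders` and the tag keys are hypothetical, and the helper `set_tags_sql` is made up for illustration; it only builds the SQL text so it can be checked without a cluster.

```python
def set_tags_sql(table, tags):
    """Build an ALTER TABLE ... SET TAGS statement from a dict of tag pairs."""
    for text in list(tags) + list(tags.values()):
        if "'" in text:
            # Keep the sketch simple: refuse quotes rather than guess at escaping.
            raise ValueError(f"quote not allowed in tag text: {text!r}")
    pairs = ", ".join(f"'{key}' = '{value}'" for key, value in tags.items())
    return f"ALTER TABLE {table} SET TAGS ({pairs})"


# Hypothetical table name and tags, for illustration only.
tag_stmt = set_tags_sql("main.sales.orders", {"domain": "sales", "layer": "gold"})
print(tag_stmt)
```

In a Databricks notebook the statement could be run with `spark.sql(tag_stmt)`. If the GUID-based storage path is ever needed, `DESCRIBE DETAIL main.sales.orders` reports it in its `location` column.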

What are the pros and cons of managed tables?
What are some things I should look into before deciding?

4 REPLIES

Hkesharwani
Contributor II

Managed tables are tables that are completely managed by Databricks, i.e. if we drop the table from Databricks, the underlying files will also be deleted.
Ideally they should be used in the following cases:

  • If you have temporary data that is not critical to your long-term storage or analysis.
  • If you have ad-hoc analysis scenarios where data is not required to persist beyond the scope of the analysis, you can use managed tables.
  • If multiple users or teams need to access and work with the same table, it's recommended to use external tables instead of managed tables. External tables provide more flexibility in terms of data sharing and access control.
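
To make the managed/external distinction concrete, here is a small sketch that builds both kinds of `CREATE TABLE` statement. The table names, columns, and the `abfss://` path are placeholders, and `create_table_sql` is a hypothetical helper; under Unity Catalog the path would also have to fall under a configured external location.

```python
def create_table_sql(table, columns, location=""):
    """Build a CREATE TABLE statement; adding LOCATION makes the table external."""
    stmt = f"CREATE TABLE IF NOT EXISTS {table} ({columns})"
    if location:
        # External table: the path is registered, but the files stay under your control.
        stmt += f" LOCATION '{location}'"
    return stmt


# Managed: no LOCATION, so Databricks chooses and owns the storage path.
managed_stmt = create_table_sql("main.staging.tmp_orders", "id BIGINT, amount DOUBLE")

# External: placeholder ADLS path, for illustration only.
external_stmt = create_table_sql(
    "main.shared.orders",
    "id BIGINT, amount DOUBLE",
    location="abfss://container@account.dfs.core.windows.net/orders",
)

print(managed_stmt)
print(external_stmt)
```

Running `DROP TABLE` on the first table eventually removes its data files as well, while on the second it removes only the catalog entry and leaves the files in place.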
Harshit Kesharwani
Data engineer at Rsystema

cltj
New Contributor III

Thanks for your response, @Hkesharwani.
In what scenario would we need to drop tables? Can't we just avoid giving drop-table privileges to our analysts, superusers and users?

Our current thinking is that we will manage access and data lifecycle anyway.

In addition, can't we just use the UNDROP command within 7 days? (We are using UC.)
UNDROP TABLE | Databricks on AWS
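
A rough sketch of what that could look like, assuming Unity Catalog. The group name `analysts` and the table and schema names are hypothetical, and the three helpers are made up for illustration; they only assemble SQL text.

```python
def grant_read_sql(table, principal):
    """Grant read-only access; the principal gets no ownership of the table."""
    return f"GRANT SELECT ON TABLE {table} TO `{principal}`"


def show_dropped_sql(schema):
    """List dropped tables in a schema that can still be recovered."""
    return f"SHOW TABLES DROPPED IN {schema}"


def undrop_sql(table):
    """Restore a dropped table while it is still inside the recovery window."""
    return f"UNDROP TABLE {table}"


# Hypothetical names, for illustration only.
statements = [
    grant_read_sql("main.sales.orders", "analysts"),
    show_dropped_sql("main.sales"),
    undrop_sql("main.sales.orders"),
]
for stmt in statements:
    print(stmt)
```

Each statement could be passed to `spark.sql(...)` in a notebook. Keeping table ownership with a service principal or admin group, and granting everyone else only `SELECT` or `MODIFY`, is one way to keep analysts from dropping tables in the first place.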


 

Hkesharwani
Contributor II

Hi cltj,
As I mentioned, you may drop tables when you only need to keep data temporarily. And yes, you can grant the team only the access it requires.
I believe https://docs.databricks.com/en/sql/language-manual/sql-ref-privileges.html will be a great help for you.

Harshit Kesharwani
Data engineer at Rsystema

Ramakrishnan83
New Contributor III

I would recommend using managed tables for table backups and for tables used during data processing in notebooks that can be dropped at the end of the process, such as staging tables. I have not explored how to copy a managed table from a Dev to a QA environment. In the case of an external table, we can copy the storage folder from the Dev storage account to the QA storage account and create the DDL.
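
One option that might cover the Dev-to-QA case for managed tables is a Delta `DEEP CLONE`, sketched below. This is not something tested in this thread: it assumes both catalogs are reachable from the same workspace, and the catalog names `dev` and `qa`, the table name, and the helper `deep_clone_sql` are all hypothetical.

```python
def deep_clone_sql(source, target):
    """Build a statement that copies a Delta table's data and metadata to a new table."""
    return f"CREATE OR REPLACE TABLE {target} DEEP CLONE {source}"


# Hypothetical Dev and QA catalogs, for illustration only.
clone_stmt = deep_clone_sql("dev.sales.orders", "qa.sales.orders")
print(clone_stmt)
```

Running `spark.sql(clone_stmt)` would create the target as its own managed table, so there is no storage folder to copy by hand.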
