
Unity Catalog Schema management

a_user12
Contributor

From time to time I read articles (such as the one linked here) that suggest using a Unity Catalog schema management tool, with all table schema changes applied via that tool.

Usually service principals (or users) have the MODIFY permission on tables. They need it because it lets them update/insert/delete data. However, it also lets them change the table schema (e.g. add new columns). So such a schema management tool can easily be bypassed, right? I know there are options such as mergeSchema, but in the end the vision of "all schema changes can only be done via a schema management tool" is not feasible as long as users have the MODIFY permission.
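To make the bypass concrete, here is a minimal PySpark sketch (catalog, table, and column names are made up for illustration): a principal that only holds MODIFY on the table can still evolve its schema simply by appending with mergeSchema enabled, so the change never goes through the schema management tool.

```python
# Minimal sketch of the bypass, assuming a Delta table main.sales.orders
# (all names here are illustrative).
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# New batch carrying a column that the target table does not have yet.
new_batch = spark.createDataFrame(
    [(1, "2024-01-01", "web")],
    ["order_id", "order_date", "channel"],  # "channel" is the new column
)

# MODIFY is enough to run this append; mergeSchema silently adds "channel"
# to the table schema, bypassing any external schema management tool.
(new_batch.write
    .format("delta")
    .option("mergeSchema", "true")
    .mode("append")
    .saveAsTable("main.sales.orders"))
```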

1 ACCEPTED SOLUTION


MoJaMa
Databricks Employee

I tend to mostly agree with you. Doing table-schema management the way I would have done it while developing ETL flows in an RDBMS world is quite different from doing it in a fast-moving, "new-sources-all-the-time" data engineering world.

There have been feature requests from customers to split MANAGE into separate DML and DDL privileges. If that happens, your "data engineering" user/SP can be given the DML privilege, while another "schema management" SP does the Liquibase-type schema management.
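Until such a split exists, the closest approximation is roughly a grant layout like the sketch below (principal and table names are placeholders, and it assumes the newer MANAGE privilege on tables). The caveat remains that MODIFY alone still permits mergeSchema-style evolution, which is exactly why the DML-only privilege was requested.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Data-engineering SP: reads and writes data, but (ideally) no DDL.
spark.sql("""
    GRANT SELECT, MODIFY ON TABLE main.sales.orders TO `sp-data-engineering`
""")

# Schema-management SP: administers the table and runs the Liquibase-type
# migrations (ALTER TABLE ... ADD COLUMN, etc.).
spark.sql("""
    GRANT MANAGE ON TABLE main.sales.orders TO `sp-schema-management`
""")

# Caveat: with today's privileges, MODIFY alone still allows schema evolution
# via mergeSchema writes, so this split only fully helps once a DML-only
# privilege exists (or schema evolution is otherwise prevented).
```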

Note: this is mostly a confusing overlap in lower environments. In higher environments you can make sure that whatever gets deployed by the "data engineering" SP does not include DDL commands. In lower environments it becomes more of a "monitor and correct" motion to ensure your devs follow the best practice for table-schema management.
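One possible shape for that "monitor and correct" motion is a scheduled job that diffs the live schema (via information_schema) against whatever the schema management tool expects. A rough sketch, with a hypothetical expected-schema dict and made-up table names:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# What the schema management tool believes the table looks like
# (hypothetical; in practice this would come from the tool's own state).
expected_columns = {"order_id": "INT", "order_date": "DATE"}

# What the table actually looks like right now.
actual_columns = {
    row["column_name"]: row["data_type"]
    for row in spark.sql("""
        SELECT column_name, data_type
        FROM main.information_schema.columns
        WHERE table_schema = 'sales' AND table_name = 'orders'
    """).collect()
}

# Flag anything added or changed outside the managed migrations.
drift = {
    col: typ
    for col, typ in actual_columns.items()
    if expected_columns.get(col) != typ
}
if drift:
    print(f"Unmanaged schema drift on main.sales.orders: {drift}")
```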

~Mohan Mathews.

