Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Best Practices for naming Tables and Databases in Databricks

Spauk
New Contributor II

We moved to Databricks a few months ago; before that, we were on SQL Server.

So, all our tables and databases follow the "camel case" rule.

Apparently, in Databricks the rule is "lower case with underscore".

Where can we find official documentation that says this, so that we can show it to our management?

Without such a document, they will never let us change anything.

Thanks.

1 ACCEPTED SOLUTION

Accepted Solutions

Hubert-Dudek
Esteemed Contributor III

@Landan George You are right, so naming everything in lowercase with underscores is the only method that makes sense. @Salah KHALFALLAH maybe you can use this document, https://api-docs.databricks.com/rest/latest/unity-catalog-api-specification.html, which states:

"Names supplied by users are converted to lower-case by DBR clients (before they are sent to the UC API). Also, input names (for all object types except Table Column Names) are converted to lower-case by the UC server, to handle the case that UC objects are created via directly accessing the UC API. With this conversion to lower-case names, the name handling is effectively case-insensitive. I.e., if a user creates a table with relative name “******”, it would conflict with an existing table named “******”."

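The conflict described in the quoted passage can be checked before a migration. The following is only a minimal sketch in plain Python, not anything from the Unity Catalog API: `find_case_conflicts` is a hypothetical helper, and the table names in the usage note are made up for illustration. It assumes that lower-casing is the only normalization applied to names.

```python
from collections import defaultdict


def find_case_conflicts(names):
    """Group names that become identical once lower-cased.

    Hypothetical helper: it only mimics the lower-casing described in the
    quoted passage and does not call any Databricks API.
    """
    groups = defaultdict(list)
    for name in names:
        groups[name.lower()].append(name)
    # Keep only the lower-cased keys that more than one original name maps to.
    return {key: found for key, found in groups.items() if len(found) > 1}
```

For example, `find_case_conflicts(["SalesOrders", "salesorders", "Customers"])` reports the first two names under the key `salesorders`, because they differ only in case.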

5 REPLIES

Hubert-Dudek
Esteemed Contributor III

I think it is up to you. I prefer lowercase, but the companies I work for do not actually use it in Databricks.

LandanG
Honored Contributor

Hi @Salah KHALFALLAH, looking at the documentation, it appears that Databricks' preferred naming convention is lowercase with underscores, as you mentioned.

The reason is most likely that Databricks uses the Hive Metastore, which is case insensitive. Querying "MyTable" is the same as querying "mytable" or "MYTABLE", and the table is displayed as "mytable" in the data browser window, so camel case may not be that helpful when naming objects.
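If existing camel case names need to be renamed, a small conversion helper can keep the word boundaries that lower-casing alone would lose. This is a rough sketch in plain Python: `to_snake_case` is a hypothetical function, not part of Databricks or Spark, and it assumes names contain only ASCII letters, digits, and underscores.

```python
import re


def to_snake_case(name: str) -> str:
    """Convert a camel case identifier to lowercase with underscores.

    Hypothetical helper for illustration; assumes ASCII identifiers.
    """
    # Put an underscore between a lowercase letter or digit and an
    # uppercase letter: "customerOrders" -> "customer_Orders".
    step = re.sub(r"([a-z0-9])([A-Z])", r"\1_\2", name)
    # Split a run of capitals from a following capitalized word:
    # "SQLServer" -> "SQL_Server".
    step = re.sub(r"([A-Z]+)([A-Z][a-z])", r"\1_\2", step)
    return step.lower()
```

With this sketch, `to_snake_case("CustomerOrders")` gives `customer_orders`, while a name that is already lowercase with underscores is returned unchanged.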

LandanG
Honored Contributor

@Hubert Dudek That's a good link, thanks for adding it.

Spauk
New Contributor II

Thank you very much!
