
Best Practices for naming Tables and Databases in Databricks

Spauk
New Contributor II

We moved to Databricks a few months ago; before that, we were on SQL Server.

So, all our tables and databases follow the "camel case" rule.

Apparently, in Databricks the rule is "lower case with underscore".

Where can we find official documentation that states this, so that we can show it to our management?

Without such a document, they will never let us change anything.

Thanks.

5 REPLIES

Hubert-Dudek
Esteemed Contributor III

I think it is up to you. I prefer lowercase, but the companies I work for do not actually use it in Databricks.

LandanG
Databricks Employee

Hi @Salah KHALFALLAH, looking at the documentation, it appears that Databricks' preferred naming convention is lowercase with underscores, as you mentioned.

This is most likely because Databricks uses the Hive Metastore, which is case-insensitive: querying "MyTable" is the same as querying "mytable" or "MYTABLE", and the table is displayed as "mytable" in the data browser window. Camel case is therefore not that helpful when naming objects.
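Since the word boundaries of a camel-case name are lost once it is displayed in lowercase, one option is to convert existing names to lowercase with underscores before migrating. Here is a minimal sketch in plain Python; `to_snake_case` is a made-up helper for illustration, not a Databricks API, and it assumes the names contain only ASCII letters, digits, and underscores.

```python
import re

def to_snake_case(name: str) -> str:
    """Convert a camel-case identifier to lowercase with underscores."""
    # Insert "_" before a capitalized word, e.g. "HTTPServer" -> "HTTP_Server".
    step1 = re.sub(r"(.)([A-Z][a-z]+)", r"\1_\2", name)
    # Insert "_" between a lowercase letter or digit and an uppercase letter,
    # e.g. "orderID" -> "order_ID".
    step2 = re.sub(r"([a-z0-9])([A-Z])", r"\1_\2", step1)
    # Collapse repeated underscores (names that already contained "_"),
    # then lowercase the whole name.
    return re.sub(r"_+", "_", step2).lower()

for original in ["MyTable", "salesOrderID", "already_snake"]:
    print(original, "->", to_snake_case(original))
```

A name with no lowercase letters, such as "MYTABLE", has no detectable word boundaries and simply becomes "mytable", so names like that would need to be reviewed by hand.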

Hubert-Dudek
Esteemed Contributor III

@Landan George You are right, so naming everything in lowercase with underscores is the only method that makes sense. @Salah KHALFALLAH, maybe you can use this document: https://api-docs.databricks.com/rest/latest/unity-catalog-api-specification.html, where it is written:

"Names supplied by users are converted to lower-case by DBR clients (before they are sent to the UC API). Also, input names (for all object types except Table Column Names) are converted to lower-case by the UC server, to handle the case that UC objects are created via directly accessing the UC API. With this conversion to lower-case names, the name handling is effectively case-insensitive. I.e., if a user creates a table with relative name “******”, it would conflict with an existing table named “******”."
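To see which names coming from a case-sensitive naming scheme would run into the conflict described in that quote, something like the following could be used before migrating. This is only a sketch in plain Python: `find_case_collisions` and the sample names are hypothetical, and it only mimics the lower-casing rule quoted above, it does not call Unity Catalog.

```python
from collections import defaultdict

def find_case_collisions(names):
    """Group names that become identical once converted to lowercase."""
    groups = defaultdict(list)
    for name in names:
        lowered = name.lower()
        # Keep each distinct spelling once, in the order it was first seen.
        if name not in groups[lowered]:
            groups[lowered].append(name)
    # Only lowercase names reached by more than one spelling are conflicts.
    return {lowered: spellings for lowered, spellings in groups.items()
            if len(spellings) > 1}

# Hypothetical table names, for illustration only.
collisions = find_case_collisions(["MyTable", "mytable", "Orders"])
print(collisions)
```

Any name listed in the result would need to be renamed, because after conversion to lowercase the different spellings refer to the same object.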

LandanG
Databricks Employee

@Hubert Dudek That's a good link, thanks for adding it.

Spauk
New Contributor II

Thank you very much!
