cancel
Showing results for 
Search instead for 
Did you mean: 
Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
cancel
Showing results for 
Search instead for 
Did you mean: 

docs.databricks.com

User16790091296
Contributor II

What is Databricks Database?

A Databricks database is a collection of tables. A Databricks table is a collection of structured data. You can cache, filter, and perform any operations supported by Apache Spark DataFrames on Databricks tables. You can query tables with Spark APIs and Spark SQL.

There are two types of tables: global and local. A global table is available across all clusters. Databricks registers global tables either to the Databricks Hive metastore or to an external Hive metastore. A local table is not accessible from other clusters and is not registered in the Hive metastore. This is also known as a temporary view.

You can create a table using the Create Table UI or programmatically. A table can be populated from files in DBFS or data stored in any of the supported data sources.

Managed and Unmanaged Tables

Every Spark SQL table has metadata information that stores the schema and the data itself.

A managed table is a Spark SQL table for which Spark manages both the data and the metadata. In the case of managed table, Databricks stores the metadata and data in DBFS in your account. Since Spark SQL manages the tables, doing a DROP TABLE example_data deletes both the metadata and data.

Another option is to let Spark SQL manage the metadata, while you control the data location. We refer to this as an unmanaged table. Spark SQL manages the relevant metadata, so when you perform DROP TABLE <example-table>, Spark removes only the metadata and not the data itself. The data is still present in the path you provided.

You can create an unmanaged table with your data in data sources such as Cassandra, JDBC table, and so on. See Data sources for more information about the data sources supported by Databricks.

More info here: https://docs.databricks.com/data/tables.html

0 REPLIES 0
Join 100K+ Data Experts: Register Now & Grow with Us!

Excited to expand your horizons with us? Click here to Register and begin your journey to success!

Already a member? Login and join your local regional user group! If there isn’t one near you, fill out this form and we’ll create one for you to join!