cancel
Showing results for 
Search instead for 
Did you mean: 
Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
cancel
Showing results for 
Search instead for 
Did you mean: 

Creating an external table reference vs creating a view

najmead
Contributor

In a practical sense, what is the difference between creating an external table;

create table my_catalog.my_schema.my_favourite_table

location 'abfss://path/to/my/data

versus creating a view that references the same dataset;

create view my_catalog.my_schema.my_favourite_table as

select *

from parquet.'abfss://path/to/my/data

Neither option is materializing the data itself. Both are just pointers to a location in the datalake. Are there differences under the hood from databricks perspective that might impact on performance?

2 REPLIES 2

Anonymous
Not applicable

@Nicholas Mead​ :

External tables are metadata definitions that map to files in a data lake, and the files are read when queries are executed against the table. This means that external tables can benefit from partitioning, indexing, and caching to improve query performance. External tables can also be used in joins, and they can be queried using SQL and Spark APIs.

Views, on the other hand, are virtual tables that are defined by a SQL query. When a view is queried, the SQL query is executed against the data lake files. Views can be useful for simplifying complex queries or hiding details about the underlying data. However, they do not support partitioning, indexing, or caching, and they cannot be used in joins.

In terms of performance, external tables are generally faster than views because they can take advantage of partitioning and caching.Overall, creating an external table is usually the better option if you want to optimize query performance and take advantage of advanced features like partitioning and caching. However, if you just need a simple view to simplify a complex query, then creating a view can be a good option

Anonymous
Not applicable

Hi @Nicholas Mead​ 

Thank you for your question! To assist you better, please take a moment to review the answer and let me know if it best fits your needs.

Please help us select the best solution by clicking on "Select As Best" if it does.

Your feedback will help us ensure that we are providing the best possible service to you. Thank you!

Connect with Databricks Users in Your Area

Join a Regional User Group to connect with local Databricks users. Events will be happening in your city, and you won’t want to miss the chance to attend and share knowledge.

If there isn’t a group near you, start one and help create a community that brings people together.

Request a New Group