cancel
Showing results for 
Search instead for 
Did you mean: 
Data Engineering
cancel
Showing results for 
Search instead for 
Did you mean: 

Creating an external table reference vs creating a view

najmead
Contributor

In a practical sense, what is the difference between creating an external table;

create table my_catalog.my_schema.my_favourite_table

location 'abfss://path/to/my/data

versus creating a view that references the same dataset;

create view my_catalog.my_schema.my_favourite_table as

select *

from parquet.'abfss://path/to/my/data

Neither option is materializing the data itself. Both are just pointers to a location in the datalake. Are there differences under the hood from databricks perspective that might impact on performance?

2 REPLIES 2

Anonymous
Not applicable

@Nicholas Mead​ :

External tables are metadata definitions that map to files in a data lake, and the files are read when queries are executed against the table. This means that external tables can benefit from partitioning, indexing, and caching to improve query performance. External tables can also be used in joins, and they can be queried using SQL and Spark APIs.

Views, on the other hand, are virtual tables that are defined by a SQL query. When a view is queried, the SQL query is executed against the data lake files. Views can be useful for simplifying complex queries or hiding details about the underlying data. However, they do not support partitioning, indexing, or caching, and they cannot be used in joins.

In terms of performance, external tables are generally faster than views because they can take advantage of partitioning and caching.Overall, creating an external table is usually the better option if you want to optimize query performance and take advantage of advanced features like partitioning and caching. However, if you just need a simple view to simplify a complex query, then creating a view can be a good option

Anonymous
Not applicable

Hi @Nicholas Mead​ 

Thank you for your question! To assist you better, please take a moment to review the answer and let me know if it best fits your needs.

Please help us select the best solution by clicking on "Select As Best" if it does.

Your feedback will help us ensure that we are providing the best possible service to you. Thank you!

Welcome to Databricks Community: Lets learn, network and celebrate together

Join our fast-growing data practitioner and expert community of 80K+ members, ready to discover, help and collaborate together while making meaningful connections. 

Click here to register and join today! 

Engage in exciting technical discussions, join a group with your peers and meet our Featured Members.