12-17-2022 05:17 PM
Data Engineering - CTAS - External Tables
Details:
CREATE TABLE IF NOT EXISTS sales_csv
(order_id LONG, email STRING, transactions_timestamp LONG, total_item_quantity INTEGER, purchase_revenue_in_usd DOUBLE, unique_items INTEGER, items STRING)
USING CSV
OPTIONS (
header = "true",
delimiter = "|"
)
LOCATION "${DA.paths.sales_csv}"
CREATE OR REPLACE TABLE sales_unparsed AS
SELECT * FROM csv.`${da.paths.datasets}/ecommerce/raw/sales-csv`;
Please help me understand.
12-18-2022 12:15 PM
The 2nd statement CTAS will not be able to parse the csv in any manner because it's just the from statement that points to a file. It's more of a traditional SQL statement with select and from. It will create a Delta Table. This just happens to be the way the syntax works with AS not allowing for options.
12-18-2022 12:15 PM
The 2nd statement CTAS will not be able to parse the csv in any manner because it's just the from statement that points to a file. It's more of a traditional SQL statement with select and from. It will create a Delta Table. This just happens to be the way the syntax works with AS not allowing for options.
12-18-2022 12:53 PM
Ahh the devil was in the details. The statements differ in USING CSV -> external table and CTAS + SELECT FROM-> managed table. I hope I got it right.
Thanks a lot for explanation. Cheers!
12-18-2022 04:11 PM
Glad I could help
Join a Regional User Group to connect with local Databricks users. Events will be happening in your city, and you won’t want to miss the chance to attend and share knowledge.
If there isn’t a group near you, start one and help create a community that brings people together.
Request a New Group