Re: Is there a way to automate Table creation in D...

BilalAslamDbrx · ‎10-04-2021

@Vasanth Kumar, is new data still arriving in these locations? If you do NOT have new data arriving, you can simply run a COPY INTO command that points to the location. Example:

CREATE TABLE DataSubject1;
 
COPY INTO DataSubject1
FROM 'abfss://landing@xyz.dfs.core.windows.net/sc/raw/DataSubject1'
FILEFORMAT = PARQUET
FORMAT_OPTIONS (
  'inferSchema' = 'true',
  'mergeSchema' = 'true'
);

Once this command works for one storage path, you can template it to run for many storage paths. Probably the easiest way to do this is to use Python variable substitution to generate the SQL as a string and run it against a cluster.
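Here is a rough sketch of what that templating could look like. The table list, the assumption that every folder sits under the same base path, and the helper names (build_statements, run_all) are all made up for illustration, and I used CREATE TABLE IF NOT EXISTS so the loop can be re-run. Adjust it to your own layout.

```python
# Assumption: every subject lives in a folder named after its table under one base path.
BASE_PATH = "abfss://landing@xyz.dfs.core.windows.net/sc/raw"

# Hypothetical list of tables to create; replace with your own names.
TABLES = ["DataSubject1", "DataSubject2", "DataSubject3"]

COPY_INTO_TEMPLATE = """COPY INTO {table}
FROM '{base_path}/{table}'
FILEFORMAT = PARQUET
FORMAT_OPTIONS (
  'inferSchema' = 'true',
  'mergeSchema' = 'true'
)"""


def build_statements(table, base_path=BASE_PATH):
    """Return the CREATE TABLE and COPY INTO statements for one table."""
    create = f"CREATE TABLE IF NOT EXISTS {table}"
    copy_into = COPY_INTO_TEMPLATE.format(table=table, base_path=base_path)
    return [create, copy_into]


def run_all(tables, execute):
    """Call execute(sql) once per generated statement, in order."""
    for table in tables:
        for statement in build_statements(table):
            execute(statement)


if __name__ == "__main__":
    # Dry run: print the SQL so it can be reviewed first.
    # In a Databricks notebook you could pass spark.sql instead of print.
    run_all(TABLES, print)
```

Passing the executor in as a function keeps the generator testable off-cluster: print gives you a dry run, and spark.sql runs each statement one at a time, which is why the generated strings carry no trailing semicolon.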

PS: Don't forget to set the OWNER of the newly created tables; otherwise you won't see them in Databricks SQL (admins will see all newly created tables).
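Setting the owner can be generated the same way. This is only a sketch: the table names and the principal below are placeholders, and owner_statement is a hypothetical helper. It relies on the ALTER TABLE ... OWNER TO statement; check that your workspace's table access control setup accepts it before running it in bulk.

```python
# Hypothetical table names and principal; replace with your own.
TABLES = ["DataSubject1", "DataSubject2", "DataSubject3"]
NEW_OWNER = "someone@example.com"


def owner_statement(table, owner):
    """Build the SQL that transfers ownership of one table."""
    # Backticks let the principal contain characters such as '@' or spaces.
    return f"ALTER TABLE {table} OWNER TO `{owner}`"


if __name__ == "__main__":
    for table in TABLES:
        # Dry run: print the SQL. In a notebook, spark.sql(...) would execute it.
        print(owner_statement(table, NEW_OWNER))
```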