cancel
Showing results forย 
Search instead forย 
Did you mean:ย 
Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
cancel
Showing results forย 
Search instead forย 
Did you mean:ย 

How to extract DDL from tables/views using a SQL query?

RichardSCarchit
New Contributor III

I want to write a SQL query that queries the information_schema to generate a list of objects, their columns, relationships etc. - basically a data dictionary.

For each object I want to show the DDL code, and I know I can get it by executing

show create table [tablename]

but this command only runs as standalone, not as subquery within DML.

I can't manually repeat 'show create table' for hundreds of tables each time, so is there a way to get to the object DDL within a SQL query? For example in Oracle I could use a subquery

SELECT DBMS_METADATA.get_ddl etc. etc.

Can't use Python for a number of reasons. Thanks for any suggestion!

4 REPLIES 4

pvignesh92
Honored Contributor

@Richard Architectโ€‹ Hi, If you are using Unity Catalog, then the Information Schema is available which you can use to get a set of views describing the objects known to the schemaโ€™s catalog. But in that as well, you can only get View definition and not table DDL I believe.

But the table DDL can be obtained programmatically by using the below code. Please see if this helps.

dbs = spark.catalog.listDatabases()
for db in dbs:
  tables = spark.catalog.listTables(db.name)
  for t in tables:
    DDL = spark.sql("SHOW CREATE TABLE {}.{}".format(db.name, t.name))
    print(DDL.first()[0])

Anonymous
Not applicable

Hi @Richard Architectโ€‹ 

Hope all is well! Just wanted to check in if you were able to resolve your issue and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. 

We'd love to hear from you.

Thanks!

RichardSCarchit
New Contributor III

Thanks but per my original post I cannot run Python in my cluster, only SQL. Above code throws an error.

Anonymous
Not applicable

Hi @Richard Architectโ€‹ 

I'm sorry you could not find a solution to your problem in the answers provided.

Our community strives to provide helpful and accurate information, but sometimes an immediate solution may only be available for some issues.

I suggest providing more information about your problem, such as specific error messages, error logs or details about the steps you have taken. This can help our community members better understand the issue and provide more targeted solutions.

Alternatively, you can consider contacting the support team for your product or service. They may be able to provide additional assistance or escalate the issue to the appropriate section for further investigation.

Thank you for your patience and understanding, and please let us know if there is anything else we can do to assist you.