Deployment-ready sample source code for Delta Live Tables & Auto Loader

Sanjay_AMP
New Contributor II

Hi all,

We are planning to develop an Auto Loader-based DLT pipeline that needs to be

  • Deployable via a CI/CD pipeline
  • Observable

Can somebody please point me to sample source code that we can use as a firm foundation, instead of falling into newbie patterns?

Thanks in advance

Sanjay

Priyanka_Biswas
Databricks Employee

Hi @Sanjay_AMP 

Delta Live Tables and Auto Loader can be used together to incrementally ingest data from cloud object storage.
• Python code example (shown directly below):
 - Define a table called "customers" that reads CSV files from cloud object storage.
 - Define a table called "sales_orders_raw" that reads JSON files from cloud object storage.
• SQL code example (see the sketch after the Python code):
 - Create or refresh a streaming table called "customers" that selects all data from CSV files in cloud object storage.
 - Create or refresh a streaming table called "sales_orders_raw" that selects all data from JSON files in cloud object storage.
• Options can be passed to the cloud_files() method using the map() function (shown in the SQL sketch).
• A schema can be specified for formats that don't support schema inference (see the final Python sketch).
• Additional code examples and details can be found in the Databricks documentation.
import dlt

# Incrementally ingest customer CSV files with Auto Loader.
@dlt.table
def customers():
  return (
    spark.readStream.format("cloudFiles")
      .option("cloudFiles.format", "csv")
      .load("/databricks-datasets/retail-org/customers/")
  )

# Incrementally ingest raw sales-order JSON files with Auto Loader.
@dlt.table
def sales_orders_raw():
  return (
    spark.readStream.format("cloudFiles")
      .option("cloudFiles.format", "json")
      .load("/databricks-datasets/retail-org/sales_orders/")
  )
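
The SQL equivalents look roughly like the sketch below, which follows the standard DLT SQL pattern; depending on your pipeline channel, the statement may need to be CREATE OR REFRESH STREAMING LIVE TABLE rather than CREATE OR REFRESH STREAMING TABLE. The second statement also illustrates passing an Auto Loader option (cloudFiles.inferColumnTypes, used here purely as an example) to cloud_files() via map():

-- Incrementally ingest customer CSV files with Auto Loader.
CREATE OR REFRESH STREAMING TABLE customers
AS SELECT * FROM cloud_files("/databricks-datasets/retail-org/customers/", "csv");

-- Incrementally ingest sales-order JSON files; the map() argument shows
-- how Auto Loader options are passed in SQL.
CREATE OR REFRESH STREAMING TABLE sales_orders_raw
AS SELECT * FROM cloud_files(
  "/databricks-datasets/retail-org/sales_orders/",
  "json",
  map("cloudFiles.inferColumnTypes", "true")
);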
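
Finally, for formats where Auto Loader cannot infer a schema, you can supply one explicitly. A minimal Python sketch, assuming a DDL-string schema; the table and column names here are illustrative placeholders, not the actual columns of the retail-org dataset:

import dlt

# Minimal sketch: declare the schema explicitly instead of relying on inference.
# Column names below are illustrative placeholders.
@dlt.table
def customers_with_schema():
  return (
    spark.readStream.format("cloudFiles")
      .option("cloudFiles.format", "csv")
      .schema("customer_id STRING, customer_name STRING, state STRING")
      .load("/databricks-datasets/retail-org/customers/")
  )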