Deployment-ready sample source-code for Delta Live Table & Autoloader
08-11-2023 06:52 AM
Hi all,
We are planning to develop an Auto Loader-based DLT pipeline that needs to be
- Deployable via a CI/CD Pipeline
- Observable
Can somebody please point us to sample source code we can use as a firm foundation, instead of falling into newbie patterns?
Thanks in advance
Sanjay
1 REPLY
08-16-2023 03:38 PM
Hi @Sanjay_AMP
Delta Live Tables and Auto Loader can be used together to incrementally ingest data from cloud object storage.
• Python code example (see the snippet after this list):
- Define a table called "customers" that reads CSV files from cloud object storage.
- Define a table called "sales_orders_raw" that reads JSON files from cloud object storage.
• SQL code example (a sketch follows the Python snippet below):
- Create or refresh a streaming table called "customers" that selects all data from CSV files in cloud object storage.
- Create or refresh a streaming table called "sales_orders_raw" that selects all data from JSON files in cloud object storage.
• Options can be passed to the cloud_files() method using the map() function (example below).
• A schema can be specified for formats that don't support schema inference (example below).
• Additional code examples can be found in the Databricks documentation.
import dlt

@dlt.table
def customers():
    # Incrementally ingest CSV files from cloud object storage with Auto Loader
    return (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "csv")
        .load("/databricks-datasets/retail-org/customers/")
    )

@dlt.table
def sales_orders_raw():
    # Incrementally ingest JSON files from cloud object storage with Auto Loader
    return (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "json")
        .load("/databricks-datasets/retail-org/sales_orders/")
    )
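The SQL version described in the bullets above would look roughly like this; a minimal sketch using the same sample dataset paths and the cloud_files() table-valued function (newer runtimes accept CREATE OR REFRESH STREAMING TABLE; older DLT releases used CREATE OR REFRESH STREAMING LIVE TABLE):

-- Streaming table over CSV files, equivalent to the Python "customers" table
CREATE OR REFRESH STREAMING TABLE customers
AS SELECT * FROM cloud_files("/databricks-datasets/retail-org/customers/", "csv");

-- Streaming table over JSON files, equivalent to the Python "sales_orders_raw" table
CREATE OR REFRESH STREAMING TABLE sales_orders_raw
AS SELECT * FROM cloud_files("/databricks-datasets/retail-org/sales_orders/", "json");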

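To pass reader options to cloud_files() in SQL, supply them as string key/value pairs via map(). A sketch follows; the header and delimiter options shown are illustrative, not required by the sample dataset:

CREATE OR REFRESH STREAMING TABLE customers
AS SELECT * FROM cloud_files(
  "/databricks-datasets/retail-org/customers/",
  "csv",
  -- reader options are passed as a map of string key/value pairs
  map("header", "true", "delimiter", ",")
);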

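And for formats where Auto Loader cannot infer a schema, the columns can be declared on the streaming table itself. A hypothetical sketch; the table name, column names, and types below are made up for illustration:

CREATE OR REFRESH STREAMING TABLE sales_orders_typed (
  -- explicit schema, used instead of schema inference
  order_number BIGINT,
  order_datetime STRING,
  customer_id STRING
)
AS SELECT * FROM cloud_files("/databricks-datasets/retail-org/sales_orders/", "json");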