Databricks Community

ac0 · 01-11-2024

I'm trying to use a global init script in Databricks to set an environment variable for use in a Delta Live Tables pipeline. I want to reference a path prefix that is passed in as a value rather than hard-coding it. Here is the code for my pipeline:

CREATE STREAMING LIVE TABLE data
COMMENT "Raw data in delta format"
TBLPROPERTIES ("quality" = "bronze", "pipelines.autoOptimize.zOrderCols" = "id")
AS
SELECT *, id FROM cloud_files(
  "${TEST_VAR}data/files", "json", map("cloudFiles.inferColumnTypes", "true")
  )

However, when I set up a global init script like the one below, the variable doesn't appear in the list of environment variables on the job compute cluster.

#!/bin/sh

sudo echo TEST_VAR=TESTING >> /etc/environment
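One thing worth ruling out is the script itself. With `sudo echo ... >> file`, the redirection is performed by the calling shell rather than by `sudo`, so the append only succeeds if the script already runs with enough privileges (I assume init scripts run as root, in which case `sudo` is redundant). Below is a more defensive sketch of the same idea. The helper name `append_env_var` and the `ENV_FILE` override are hypothetical, introduced only so the script can be tried outside a cluster without touching `/etc/environment`.

```shell
#!/bin/sh
# Sketch only: append KEY=VALUE to an environment file if KEY is not already there.
# ENV_FILE is a hypothetical override; on a cluster it would be /etc/environment.
ENV_FILE="${ENV_FILE:-/tmp/environment.sketch}"

append_env_var() {
  key="$1"
  value="$2"
  file="$3"
  # Make sure the file exists so grep does not fail on a missing path.
  touch "$file"
  # Append only when no line already starts with KEY=, so reruns do not duplicate it.
  if ! grep -q "^${key}=" "$file"; then
    printf '%s=%s\n' "$key" "$value" >> "$file"
  fi
}

append_env_var TEST_VAR TESTING "$ENV_FILE"

# Print the matching line so the result shows up in the init script log.
grep '^TEST_VAR=' "$ENV_FILE"
```

Running it with `ENV_FILE=/etc/environment` would mirror my original script. If the line is written but the variable is still missing from the pipeline, then the script is not the problem.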

Is this because the cluster type is PIPELINE? Is what I am attempting possible? Are global init scripts even run for Delta Live Tables pipelines, and can environment variables be referenced in SQL-style pipelines? I am finding little documentation about this online.