Your error suggests that while your DLT pipeline works for materialized views (batch reads), switching to a streaming table that uses Autoloader (readStream) triggers an ADLS Gen2 authentication failure, specifically "Could not find ADLS Gen2 Token", in the streaming context.
Why This Happens
Autoloader (readStream) in DLT pipelines is built on Spark Structured Streaming. Accessing ADLS Gen2 with AAD passthrough works for batch queries and mounts, but long-running streaming queries need continuous re-authentication: the user's ADLS token may not persist or refresh correctly, and some APIs are sensitive to the authentication context. This is especially true in DLT pipelines, which run in a managed service context rather than under your interactive session.
Troubleshooting & Solutions
1. Use Service Principal or Managed Identity
- AAD Passthrough limitations: For streaming (Autoloader), AAD passthrough sometimes fails because tokens are not refreshed correctly for long-running streams.
- Best practice: configure the DLT pipeline to use a service principal (via the Spark configs shown in section 3 below) or a managed identity for Databricks. This removes the dependency on user tokens.
2. Storage Mounts and Streaming
- While mounting works for batch jobs, mounts that rely on AAD passthrough are not supported for streaming with Autoloader. Always use the direct abfss:// path rather than a /mnt/ path for streaming, as in the sketch below.
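As a rough sketch of the streaming side, this is what a DLT streaming table reading directly from ADLS Gen2 looks like (the table name, file format, and path placeholders are illustrative, not taken from your pipeline):

```python
import dlt

@dlt.table
def raw_events():
    # Note the direct abfss:// URI rather than a /mnt mount point.
    return (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "json")  # adjust to your source format
        .load("abfss://<CONTAINER>@<STORAGE_ACCOUNT>.dfs.core.windows.net/<PATH>")
    )
```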
3. Pipeline Configuration
- Ensure these Spark configs are set, either in the notebook or in the DLT pipeline settings (replace the placeholders):
spark.conf.set("fs.azure.account.auth.type.<STORAGE_ACCOUNT>.dfs.core.windows.net", "OAuth")
spark.conf.set("fs.azure.account.oauth.provider.type.<STORAGE_ACCOUNT>.dfs.core.windows.net",
"org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider")
spark.conf.set("fs.azure.account.oauth2.client.id.<STORAGE_ACCOUNT>.dfs.core.windows.net", "<CLIENT_ID>")
spark.conf.set("fs.azure.account.oauth2.client.secret.<STORAGE_ACCOUNT>.dfs.core.windows.net", "<CLIENT_SECRET>")
spark.conf.set("fs.azure.account.oauth2.client.endpoint.<STORAGE_ACCOUNT>.dfs.core.windows.net",
"https://login.microsoftonline.com/<TENANT_ID>/oauth2/token")
- These configs require a registered service principal or a Databricks managed identity; do not use your personal credentials.
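Rather than hardcoding the client secret, you can pull it from a Databricks secret scope at runtime. A minimal sketch, assuming a scope and key you create yourself (the names "adls-auth" and "sp-client-secret" are hypothetical):

```python
# Hypothetical scope/key names; store the secret via the Databricks secrets CLI.
client_secret = dbutils.secrets.get(scope="adls-auth", key="sp-client-secret")

spark.conf.set(
    "fs.azure.account.oauth2.client.secret.<STORAGE_ACCOUNT>.dfs.core.windows.net",
    client_secret,
)
```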
4. Autoloader Options
- Confirm you are using abfss://container@account.dfs.core.windows.net/... in your .load() call rather than a mount path (see the example in section 2 above).
- Check with your Azure admin or support to see whether you can provision a service principal or managed identity for Databricks.
5. Permissions
- The service principal or managed identity needs the Storage Blob Data Contributor role on the ADLS Gen2 storage account for streaming operations.
What You Can Do
- Contact your Azure admin to configure a service principal or to assign managed identity permissions to the Databricks workspace.
- Update your DLT pipeline configuration to use direct ADLS Gen2 paths and the appropriate credential configs.
- Avoid relying solely on AAD passthrough for Structured Streaming workloads in DLT.
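Before rerunning the pipeline, a quick sanity check can separate authentication problems from streaming problems. A sketch, reusing the same placeholders as above:

```python
# With the OAuth configs set, a plain batch listing against the direct path
# should succeed. If this call fails, fix authentication first; the issue
# is not Autoloader or the streaming table itself.
dbutils.fs.ls("abfss://<CONTAINER>@<STORAGE_ACCOUNT>.dfs.core.windows.net/<PATH>")
```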
Additional Resources
For step-by-step configuration details, review the Databricks documentation on Auto Loader and on connecting to Azure Data Lake Storage Gen2 with a service principal.