- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
12-03-2025 06:06 AM - edited 12-03-2025 06:07 AM
Hello @Rohit_hk ,
Below are the answers to your questions:
-
Is this the expected behaviour (schemaHints only accepts an inline string and not a file path)?
Ans:
Yes, cloudFiles.schemaHints is defined as a plain String option, and we can only pass DDL strings, not paths. There is no support for interpreting its value as a path to a JSON file .Doc :https://docs.databricks.com/aws/en/ingestion/cloud-object-storage/auto-loader/options
-
Is there any supported way in DLT SQL to:
-
load a schema definition from a JSON file, and
-
feed it into cloudFiles / schemaHints (or another option) without hardcoding the full “col dtype, col2 dtype…” string in the SQL?
-
Ans:
Not directly. In a pure SQL DLT (Lakeflow Declarative Pipelines) notebook:
- You cannot read an arbitrary JSON file from DBFS/S3 and inject its contents into cloud_files or schemaHints at runtime.
- The options map in cloud_files/read_files must be literals or parameter expansions, not the result of reading a file.
However, DLT / Lakeflow pipelines support parameters whose values are injected into your SQL as strings Doc: https://docs.databricks.com/aws/en/ldp/parameters
You can:
-
Put this in your pipeline configuration (UI/JSON):
"configuration": {
"myschema.my_table_hints": "id INT, sal DECIMAL(10,2), other_col STRING"
}-
Reference it in your SQL notebook:
CREATE OR REFRESH STREAMING LIVE TABLE bronze_my_table AS
SELECT *
FROM STREAM cloud_files(
's3://my-bucket/path',
'parquet',
map(
'cloudFiles.schemaHints', '${myschema.my_table_hints}',
'cloudFiles.schemaLocation', 'dbfs:/schemas/autoloader/my_table'
)
);