Re: Bug Report: SDP (DLT) with autoloader not pass...

ChrisLawford_n1 · ‎01-15-2026

I am noticing a difference between using autoloader in an interactive notebook vs using it in a Spark Declarative Pipeline (DLT Pipeline). This issue seems to be very similar to this other unanswered post from a few years ago. Bug report: the delimiter option does not work when run on DLT
The CSV I am trying to ingest uses the pipe character (|) as its separator.

```

df = (spark.readStream
.format("cloudFiles")
.option("cloudFiles.format", "csv")
.option("delimiter", "|")
.option("cloudFiles.schemaLocation", f"/Volumes/{location}/schema")
.load(f"/Volumes/{path to csv}")
.writeStream
.option("checkpointLocation", f"/Volumes/{location}/checkpoint")
.trigger(availableNow=True)
.toTable(f"{table_location}")
)
```
^ running this in a notebook serverless has no issues.

``` python
@Dlt.table(
name=f"{bronze_raw_schema}.{table_name}",
)
def load_table():
return spark.readStream.format("cloudFiles")
.format("cloudFiles")
.option("cloudFiles.format", "csv")
.option("delimiter", "|")
.option("header", "true")
.option("cloudFiles.includeExistingFiles", "true")
.load(f"/Volumes/{path to csv}")
```

When I run the above in a SDP (DLT) pipeline I get the following error. Which appears to be failing to delimit on the pipe (|) character and is thus hitting a length limit on the header column names.

nikhilj0421 · ‎01-18-2026

Hi @ChrisLawford_n1, can you try doing a full refresh of this table? The old schema may still be in the schema location.

ChrisLawford_n1 · ‎01-19-2026

Hey,
Okay thanks @nikhilj0421. I have now solved the issue but not with a full refresh of the table. I had tried this previously and even deleted the DLT pipeline hoping that would provide me the clean slate if this lingering schema was an issue but when re-running the pipeline it still failed with the same issue. The solution was to change the output schema(in my case to {schema_name}_2) of the table/pipeline and run again. I was then able to again alter the schema back to the original schema_name and this time everything worked without a problem.

Overall I have learned that DLT pipelines must store more information than I previously thought when deleting them or full refreshing them.

View solution in original post

Bug Report: SDP (DLT) with autoloader not passing through pipe delimiter/separator