Databricks Community

YSF · 04-30-2023

I'm trying to set up Auto Loader to read some CSV files. I tried Auto Loader both with the DLT decorator and on its own. The first column of the data is called "run_id", and when I run spark.read.csv() directly on the file, it comes in fine. When I use Auto Loader, though, it seems to see the first column as a curly brace, "{". I can't find anything online about what that is or where it's coming from.

Here's a sample of the Auto Loader call:

df = (spark.readStream.format("cloudFiles")
      .option("cloudFiles.format", "csv")
      .option("cloudFiles.includeExistingFiles", "true")
      .option("cloudFiles.schemaLocation", "/dbfs/schema_registry/")
      .load("/dbfs/mnt/folder/data_20230519.csv"))

Anyone know what's going on?
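
One check that might help narrow this down is to look at what a given file actually starts with, outside of Spark. The sketch below uses only the Python standard library. The helpers `peek_header` and `looks_like_json` are hypothetical names made up for illustration, not part of Spark or Auto Loader, and the sketch assumes the file can be opened through a local path on the driver. It does not explain the "{" column; it only shows whether a file's first row is a CSV header or something that looks like JSON.

```python
import csv
from pathlib import Path


def peek_header(path, encoding="utf-8-sig"):
    """Return the fields of the first row of a CSV file, or [] if the file is empty.

    Hypothetical helper for illustration. The "utf-8-sig" encoding strips a
    byte-order mark if one is present, so it does not end up in the first name.
    """
    with Path(path).open(newline="", encoding=encoding) as handle:
        return next(csv.reader(handle), [])


def looks_like_json(path, encoding="utf-8-sig"):
    """Return True if the first non-blank line of the file starts with '{' or '['."""
    with Path(path).open(encoding=encoding) as handle:
        for line in handle:
            stripped = line.strip()
            if stripped:
                return stripped[0] in "{["
    return False
```

For example, calling `peek_header("/dbfs/mnt/folder/data_20230519.csv")` would be expected to return a list beginning with "run_id" if the file itself has that header, assuming that path is readable from the driver through local file APIs. Running `looks_like_json` on any other files that sit next to the data can show whether something non-CSV is present in the same location.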