Autoloader not ingesting all file data into Delta Table from Azure Blob Container

KristiLogos
Contributor
I have done the following, i.e. created a Delta table into which I plan to load the .json.gz files from an Azure Blob container:
 
df = spark.read.option("multiline", "true").json(f"{container_location}/*.json.gz")
 
 
from delta.tables import DeltaTable

DeltaTable.create(spark) \
    .addColumns(df.schema) \
    .property("delta.minReaderVersion", "2") \
    .property("delta.minWriterVersion", "5") \
    .property("delta.columnMapping.mode", "name") \
    .tableName('tablename') \
    .execute()
 
Then I set up the Autoloader:
df_autoloader = (spark.readStream
                 .format("cloudFiles")
                 .option("cloudFiles.resourceGroup", "resourcename")
                 .option("cloudFiles.subscriptionId", "12345")
                 .option("cloudFiles.tenantId", "12345")
                 .option("cloudFiles.clientId", "12345")
                 .option("cloudFiles.clientSecret", "12345")
                 .option("cloudFiles.format", "json")
                 .option("multiline", "true")  
                 .option("cloudFiles.useNotifications", "true")  
                 .schema(schema)  
                 .load(AMP_LOC)  # path to the Blob container
)
 
(df_autoloader.writeStream
    .format("delta")
    .outputMode("append")
    .option("checkpointLocation", checkpoint_dir)  
    .table("tablename")
)
 
I see things happening in the cell, but when I query the table I see only 200 rows of data, when there should be millions.
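
A quick check that may help narrow this down is to count how many top-level JSON values one of the .json.gz files actually contains. The sketch below uses only the Python standard library and a hypothetical helper named count_top_level_values; it assumes a sample file has been copied somewhere the driver can read locally, and it is a diagnostic aid, not a confirmed explanation of the missing rows.

```python
import gzip
import json


def count_top_level_values(path):
    """Count top-level JSON values in a gzipped file (hypothetical helper).

    A file holding a single JSON document (object or array) yields 1;
    a newline-delimited file yields one count per record.
    """
    # Reads the whole file into memory, so use it on a sample file only.
    with gzip.open(path, "rt", encoding="utf-8") as fh:
        text = fh.read()

    decoder = json.JSONDecoder()
    pos, count = 0, 0
    while True:
        # raw_decode does not skip leading whitespace, so skip it here.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            break
        _, pos = decoder.raw_decode(text, pos)
        count += 1
    return count
```

If this returns more than 1 for a sample file, the file is probably newline-delimited JSON instead of one multi-line document, in which case the "multiline" option may be worth revisiting. Comparing the number of files in the container with the 200 rows in the table could also be informative.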