<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: autoloader documentation does not work in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/autoloader-documentation-does-not-work/m-p/11436#M6415</link>
    <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;It looks like you are writing to a path that is not empty and already contains some non-Delta files.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Also, can you confirm whether the path in the error message, `s3://nbu-ml/projects/rca/msft/dsm09collectx/delta`, is the path you are writing to or the one you are reading from? I hit a similar error, but in my case it happened when I read a Delta table path with .option("cloudFiles.format", "parquet"). I got past it by adding spark.databricks.delta.formatCheck.enabled=false to the Spark config.&lt;/P&gt;</description>
    <pubDate>Wed, 08 Feb 2023 20:38:52 GMT</pubDate>
    <dc:creator>Murthy1</dc:creator>
    <dc:date>2023-02-08T20:38:52Z</dc:date>
    <item>
      <title>autoloader documentation does not work</title>
      <link>https://community.databricks.com/t5/data-engineering/autoloader-documentation-does-not-work/m-p/11435#M6414</link>
      <description>&lt;P&gt;I am trying to follow the documentation here:&lt;/P&gt;&lt;P&gt;&lt;A href="https://learn.microsoft.com/en-us/azure/databricks/getting-started/etl-quick-start" alt="https://learn.microsoft.com/en-us/azure/databricks/getting-started/etl-quick-start" target="_blank"&gt;https://learn.microsoft.com/en-us/azure/databricks/getting-started/etl-quick-start&lt;/A&gt;&lt;/P&gt;&lt;P&gt;My code looks like this:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;from pyspark.sql.functions import current_timestamp, input_file_name

(spark.readStream
  .format("cloudFiles")
  .option("header", "true")
  #.option("cloudFiles.partitionColumns", "date, hour")
  .option("cloudFiles.format", "csv")
  .option("cloudFiles.maxBytesPerTrigger", "10m")
  .option("cloudFiles.schemaHints", SCHEMA_HINT)
  .option("cloudFiles.schemaLocation", checkpoint_path)
  .option("cloudFiles.schemaEvolutionMode", "addNewColumns")
  .load(file_path)
  .withColumn('source_file', input_file_name())
  .withColumn('processing_time', current_timestamp())
  .withColumnRenamed("date","timestamp")
  .withColumnRenamed("FW_Version","fw_version_1")
  .withColumnRenamed('fw_version','fw_version_2') # &lt;A href="https://kb.databricks.com/en_US/sql/dupe-column-in-metadata" target="test_blank"&gt;https://kb.databricks.com/en_US/sql/dupe-column-in-metadata&lt;/A&gt;
  .withColumnRenamed('Time_since_last_clear_[Min]', 'Time_since_last_clear_min') # delta does not like column names with brackets
  .writeStream
  .format("delta")
  .option("checkpointLocation", checkpoint_path)
  .option("path", delta_path)
  .trigger(availableNow=True)
  .toTable(table_name))&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;(I commented out the partition option because one of my original columns has the same name as a partition column, so it gets overwritten. I could not find a workaround.)&lt;/P&gt;&lt;P&gt;However, it does not work.&lt;/P&gt;&lt;P&gt;I get the following error:&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;AnalysisException: Incompatible format detected.

You are trying to write to `s3://nbu-ml/projects/rca/msft/dsm09collectx/delta` using Databricks Delta, but there is no
transaction log present. Check the upstream job to make sure that it is writing
using format("delta") and that you are trying to write to the table base path.

To disable this check, SET spark.databricks.delta.formatCheck.enabled=false
To learn more about Delta, see &lt;A href="https://docs.databricks.com/delta/index.html" target="test_blank"&gt;https://docs.databricks.com/delta/index.html&lt;/A&gt;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 18 Jan 2023 10:22:03 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/autoloader-documentation-does-not-work/m-p/11435#M6414</guid>
      <dc:creator>chanansh</dc:creator>
      <dc:date>2023-01-18T10:22:03Z</dc:date>
    </item>
    <item>
      <title>Re: autoloader documentation does not work</title>
      <link>https://community.databricks.com/t5/data-engineering/autoloader-documentation-does-not-work/m-p/11436#M6415</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;It looks like you are writing to a path that is not empty and already contains some non-Delta files.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Also, can you confirm whether the path in the error message, `s3://nbu-ml/projects/rca/msft/dsm09collectx/delta`, is the path you are writing to or the one you are reading from? I hit a similar error, but in my case it happened when I read a Delta table path with .option("cloudFiles.format", "parquet"). I got past it by adding spark.databricks.delta.formatCheck.enabled=false to the Spark config.&lt;/P&gt;</description>
      <pubDate>Wed, 08 Feb 2023 20:38:52 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/autoloader-documentation-does-not-work/m-p/11436#M6415</guid>
      <dc:creator>Murthy1</dc:creator>
      <dc:date>2023-02-08T20:38:52Z</dc:date>
    </item>
  </channel>
</rss>

