<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Problem with Autoloader, S3, and wildcard in Machine Learning</title>
    <link>https://community.databricks.com/t5/machine-learning/problem-with-autoloader-s3-and-wildcard/m-p/22461#M1268</link>
    <description>&lt;P&gt;The error was mostly related to existing resources on the AWS side, so we deleted and cleared the SQS queues and SNS topics. We also used the CloudFilesAWSResourceManager:&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;import com.databricks.sql.CloudFilesAWSResourceManager

// Build a resource manager for the path that Auto Loader reads from.
val manager = CloudFilesAWSResourceManager
  .newManager
  .option("path", filePath)
  .create()

// Set up the SNS and SQS notification services; the argument is used as the resource suffix.
manager.setUpNotificationServices(notificationServices)&lt;/CODE&gt;&lt;/PRE&gt;</description>
    <pubDate>Wed, 16 Nov 2022 15:43:14 GMT</pubDate>
    <dc:creator>Raymond_Garcia</dc:creator>
    <dc:date>2022-11-16T15:43:14Z</dc:date>
    <item>
      <title>Problem with Autoloader, S3, and wildcard</title>
      <link>https://community.databricks.com/t5/machine-learning/problem-with-autoloader-s3-and-wildcard/m-p/22460#M1267</link>
      <description>&lt;P&gt;Hello, I have a fairly standard Auto Loader job whose file_path variable points to an S3 bucket. Example 2 executes successfully, while example 1 throws an exception.&lt;/P&gt;&lt;P&gt;It seems that source 1 always throws an exception, whereas source 2 works with the specific path but also throws an exception when I use a more generic path such as ????-??/??-??.&lt;/P&gt;&lt;P&gt;If anybody has a clue how to solve this issue, it would be helpful. Thanks in advance!&lt;/P&gt;&lt;P&gt;Example 1: val file_path = "/mnt/output/raw/source1/????-??/??-??/*.e.ndjson"&lt;/P&gt;&lt;P&gt;Example 2: val file_path = "/mnt/output/raw/source2/2022-11/14-??/*.e.ndjson"&lt;/P&gt;&lt;P&gt;The exception is either&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;com.amazonaws.services.s3.model.AmazonS3Exception: Unable to validate the following destination configurations (Service: Amazon S3; Status Code: 400; Error Code: InvalidArgument;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;or&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;Configuration is ambiguously defined. Cannot have overlapping suffixes in two rules if the prefixes are overlapping for the same event type. (Service: Amazon S3; Status Code: 400; Error Code: InvalidArgument;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;The streaming code is:&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;import org.apache.spark.sql.streaming.Trigger

val reader = spark.readStream
  .format("cloudFiles")
  .option("cloudFiles.format", "text")
  .option("cloudFiles.schemaLocation", checkpoint_path)
  .option("cloudFiles.useNotifications", true)
  .load(file_path)
  .selectExpr("value")
  .writeStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "kafka:9092")
  .option("topic", "test_topic_3")
  .option("checkpointLocation", checkpoint_path)
  .trigger(Trigger.AvailableNow)
  .start()&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;&lt;/P&gt;</description>
      <pubDate>Mon, 14 Nov 2022 17:19:02 GMT</pubDate>
      <guid>https://community.databricks.com/t5/machine-learning/problem-with-autoloader-s3-and-wildcard/m-p/22460#M1267</guid>
      <dc:creator>Raymond_Garcia</dc:creator>
      <dc:date>2022-11-14T17:19:02Z</dc:date>
    </item>
    <item>
      <title>Re: Problem with Autoloader, S3, and wildcard</title>
      <link>https://community.databricks.com/t5/machine-learning/problem-with-autoloader-s3-and-wildcard/m-p/22461#M1268</link>
      <description>&lt;P&gt;The error was mostly related to existing resources on the AWS side, so we deleted and cleared the SQS queues and SNS topics. We also used the CloudFilesAWSResourceManager:&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;import com.databricks.sql.CloudFilesAWSResourceManager

// Build a resource manager for the path that Auto Loader reads from.
val manager = CloudFilesAWSResourceManager
  .newManager
  .option("path", filePath)
  .create()

// Set up the SNS and SQS notification services; the argument is used as the resource suffix.
manager.setUpNotificationServices(notificationServices)&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Wed, 16 Nov 2022 15:43:14 GMT</pubDate>
      <guid>https://community.databricks.com/t5/machine-learning/problem-with-autoloader-s3-and-wildcard/m-p/22461#M1268</guid>
      <dc:creator>Raymond_Garcia</dc:creator>
      <dc:date>2022-11-16T15:43:14Z</dc:date>
    </item>
  </channel>
</rss>

