<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Read and saving Blob data from oracle to databricks S3 is slow in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/read-and-saving-blob-data-from-oracle-to-databricks-s3-is-slow/m-p/13176#M7890</link>
    <description>&lt;P&gt;I am trying to import a table from Oracle that has around 1.3 million rows. One of its columns is a BLOB, and the total size of the data in Oracle is over 250 GB. Reading the table and saving it to S3 as a Delta table takes around 60 minutes. I tried a parallel read over JDBC (200 threads), but it still takes too long.&lt;/P&gt;&lt;P&gt;I would appreciate any suggestions to speed up the process.&lt;/P&gt;</description>
    <pubDate>Sat, 16 Oct 2021 09:16:34 GMT</pubDate>
    <dc:creator>RKNutalapati</dc:creator>
    <dc:date>2021-10-16T09:16:34Z</dc:date>
    <item>
      <title>Read and saving Blob data from oracle to databricks S3 is slow</title>
      <link>https://community.databricks.com/t5/data-engineering/read-and-saving-blob-data-from-oracle-to-databricks-s3-is-slow/m-p/13176#M7890</link>
      <description>&lt;P&gt;I am trying to import a table from Oracle that has around 1.3 million rows. One of its columns is a BLOB, and the total size of the data in Oracle is over 250 GB. Reading the table and saving it to S3 as a Delta table takes around 60 minutes. I tried a parallel read over JDBC (200 threads), but it still takes too long.&lt;/P&gt;&lt;P&gt;I would appreciate any suggestions to speed up the process.&lt;/P&gt;</description>
      <pubDate>Sat, 16 Oct 2021 09:16:34 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/read-and-saving-blob-data-from-oracle-to-databricks-s3-is-slow/m-p/13176#M7890</guid>
      <dc:creator>RKNutalapati</dc:creator>
      <dc:date>2021-10-16T09:16:34Z</dc:date>
    </item>
    <item>
      <title>Re: Read and saving Blob data from oracle to databricks S3 is slow</title>
      <link>https://community.databricks.com/t5/data-engineering/read-and-saving-blob-data-from-oracle-to-databricks-s3-is-slow/m-p/13177#M7891</link>
      <description>&lt;P&gt;Hello, @Rama Krishna N​&amp;nbsp;- My name is Piper and I'm one of the community moderators. Thanks for your question. Let's give it a bit longer to see what the community says. Thank you for your patience.&lt;/P&gt;</description>
      <pubDate>Sun, 17 Oct 2021 18:56:00 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/read-and-saving-blob-data-from-oracle-to-databricks-s3-is-slow/m-p/13177#M7891</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2021-10-17T18:56:00Z</dc:date>
    </item>
    <item>
      <title>Re: Read and saving Blob data from oracle to databricks S3 is slow</title>
      <link>https://community.databricks.com/t5/data-engineering/read-and-saving-blob-data-from-oracle-to-databricks-s3-is-slow/m-p/13178#M7892</link>
      <description>&lt;P&gt;Can you check the parallel threads and confirm whether the read or the write operation is the slower one? Slow reads can be caused by network issues or by concurrency issues on the database.&lt;/P&gt;</description>
      <pubDate>Wed, 20 Oct 2021 08:04:26 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/read-and-saving-blob-data-from-oracle-to-databricks-s3-is-slow/m-p/13178#M7892</guid>
      <dc:creator>User16829050420</dc:creator>
      <dc:date>2021-10-20T08:04:26Z</dc:date>
    </item>
    <item>
      <title>Re: Read and saving Blob data from oracle to databricks S3 is slow</title>
      <link>https://community.databricks.com/t5/data-engineering/read-and-saving-blob-data-from-oracle-to-databricks-s3-is-slow/m-p/13179#M7893</link>
      <description>&lt;P&gt;Thanks, @Ashwinkumar Jayakumar, for the reply. I tried dataFrame.count, and it didn't take much time. Please suggest a better way to check whether the read operation is slow, if there is one.&lt;/P&gt;</description>
      <pubDate>Wed, 20 Oct 2021 08:28:22 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/read-and-saving-blob-data-from-oracle-to-databricks-s3-is-slow/m-p/13179#M7893</guid>
      <dc:creator>RKNutalapati</dc:creator>
      <dc:date>2021-10-20T08:28:22Z</dc:date>
    </item>
    <item>
      <title>Re: Read and saving Blob data from oracle to databricks S3 is slow</title>
      <link>https://community.databricks.com/t5/data-engineering/read-and-saving-blob-data-from-oracle-to-databricks-s3-is-slow/m-p/13180#M7894</link>
      <description>&lt;P&gt;Hello @Rama Krishna N&amp;nbsp;- We will need to check the task in the Spark UI to determine whether the slow operation is the read from the Oracle database or the write to S3.&lt;/P&gt;&lt;P&gt;The task should show the specific operation in the UI.&lt;/P&gt;&lt;P&gt;Also, the active threads in the Spark UI will show whether that operation is a database operation.&lt;/P&gt;</description>
      <pubDate>Wed, 20 Oct 2021 08:36:09 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/read-and-saving-blob-data-from-oracle-to-databricks-s3-is-slow/m-p/13180#M7894</guid>
      <dc:creator>User16829050420</dc:creator>
      <dc:date>2021-10-20T08:36:09Z</dc:date>
    </item>
    <item>
      <title>Re: Read and saving Blob data from oracle to databricks S3 is slow</title>
      <link>https://community.databricks.com/t5/data-engineering/read-and-saving-blob-data-from-oracle-to-databricks-s3-is-slow/m-p/94302#M38862</link>
      <description>&lt;P&gt;Is there any update on this topic? What is the best option for reading from Oracle and writing to ADLS?&lt;/P&gt;</description>
      <pubDate>Wed, 16 Oct 2024 14:21:05 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/read-and-saving-blob-data-from-oracle-to-databricks-s3-is-slow/m-p/94302#M38862</guid>
      <dc:creator>vinita_mehta</dc:creator>
      <dc:date>2024-10-16T14:21:05Z</dc:date>
    </item>
  </channel>
</rss>

