<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Can I get more details on the performance differences between pyodbc and SQL Connector for Pytho in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/can-i-get-more-details-on-the-performance-differences-between/m-p/129890#M48621</link>
    <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/181042"&gt;@Travis84&lt;/a&gt;,&lt;/P&gt;&lt;P&gt;I came across a blog post that might help. It describes high-bandwidth connectivity using Databricks’ Cloud Fetch optimization (parallel data transfer via pre-signed URLs) and reports up to 12× faster extract throughput for very large datasets (~3.4 GB) when using ODBC-based tooling:&lt;BR /&gt;&lt;A href="https://www.databricks.com/blog/2021/08/11/how-we-achieved-high-bandwidth-connectivity-with-bi-tools.html" target="_blank"&gt;https://www.databricks.com/blog/2021/08/11/how-we-achieved-high-bandwidth-connectivity-with-bi-tools.html&lt;/A&gt;&lt;/P&gt;</description>
    <pubDate>Wed, 27 Aug 2025 11:26:46 GMT</pubDate>
    <dc:creator>WiliamRosa</dc:creator>
    <dc:date>2025-08-27T11:26:46Z</dc:date>
    <item>
      <title>Can I get more details on the performance differences between pyodbc and SQL Connector for Python?</title>
      <link>https://community.databricks.com/t5/data-engineering/can-i-get-more-details-on-the-performance-differences-between/m-p/129753#M48598</link>
      <description>&lt;P&gt;This article (&lt;A href="https://docs.databricks.com/aws/en/dev-tools/pyodbc" target="_blank"&gt;Connect Python and pyodbc to Databricks | Databricks on AWS&lt;/A&gt;) states the following:&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;"However pyodbc&amp;nbsp;&lt;/SPAN&gt;&lt;SPAN&gt;may have better performance when fetching queries results above 10 MB."&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;This is a bit vague. The word "may" implies "maybe not". Also, "better performance" is not quantitative. How much better? Are there any benchmarking studies? I have not been able to find any further information.&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 26 Aug 2025 07:24:18 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/can-i-get-more-details-on-the-performance-differences-between/m-p/129753#M48598</guid>
      <dc:creator>Travis84</dc:creator>
      <dc:date>2025-08-26T07:24:18Z</dc:date>
    </item>
    <item>
      <title>Re: Can I get more details on the performance differences between pyodbc and SQL Connector for Pytho</title>
      <link>https://community.databricks.com/t5/data-engineering/can-i-get-more-details-on-the-performance-differences-between/m-p/129822#M48612</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/181042"&gt;@Travis84&lt;/a&gt;,&lt;/P&gt;&lt;P&gt;The documentation sounds vague, but real-world performance depends on many factors. To get a clear answer, the best approach is to run a simple test: execute the same large query with both pyodbc and the Databricks SQL Connector in your environment and compare the results.&lt;/P&gt;</description>
      <pubDate>Tue, 26 Aug 2025 14:00:59 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/can-i-get-more-details-on-the-performance-differences-between/m-p/129822#M48612</guid>
      <dc:creator>SP_6721</dc:creator>
      <dc:date>2025-08-26T14:00:59Z</dc:date>
    </item>
    <item>
      <title>Re: Can I get more details on the performance differences between pyodbc and SQL Connector for Pytho</title>
      <link>https://community.databricks.com/t5/data-engineering/can-i-get-more-details-on-the-performance-differences-between/m-p/129868#M48617</link>
      <description>&lt;P&gt;I can't give you a comparison, but at least in my case the &lt;STRONG&gt;Spark SQL Connector&lt;/STRONG&gt; over SQL Server performs well when retrieving a moderate number of rows from SQL tables. As noted in previous comments, it depends on multiple factors: not only the driver but also the database design. In any case, I started using that connector because I needed an OLTP system integrated with the Databricks Lakehouse. I created a set of functions to interact with SQL Server, and so far performance has been good. With the new addition of &lt;STRONG&gt;Databricks Lakebase&lt;/STRONG&gt; over PostgreSQL, maybe I need to upgrade. I mention this because this new feature may be a good fit for you as well.&lt;/P&gt;</description>
      <pubDate>Wed, 27 Aug 2025 08:57:31 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/can-i-get-more-details-on-the-performance-differences-between/m-p/129868#M48617</guid>
      <dc:creator>Coffee77</dc:creator>
      <dc:date>2025-08-27T08:57:31Z</dc:date>
    </item>
    <item>
      <title>Re: Can I get more details on the performance differences between pyodbc and SQL Connector for Pytho</title>
      <link>https://community.databricks.com/t5/data-engineering/can-i-get-more-details-on-the-performance-differences-between/m-p/129890#M48621</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/181042"&gt;@Travis84&lt;/a&gt;,&lt;/P&gt;&lt;P&gt;I came across a blog post that might help. It describes high-bandwidth connectivity using Databricks’ Cloud Fetch optimization (parallel data transfer via pre-signed URLs) and reports up to 12× faster extract throughput for very large datasets (~3.4 GB) when using ODBC-based tooling:&lt;BR /&gt;&lt;A href="https://www.databricks.com/blog/2021/08/11/how-we-achieved-high-bandwidth-connectivity-with-bi-tools.html" target="_blank"&gt;https://www.databricks.com/blog/2021/08/11/how-we-achieved-high-bandwidth-connectivity-with-bi-tools.html&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 27 Aug 2025 11:26:46 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/can-i-get-more-details-on-the-performance-differences-between/m-p/129890#M48621</guid>
      <dc:creator>WiliamRosa</dc:creator>
      <dc:date>2025-08-27T11:26:46Z</dc:date>
    </item>
    <item>
      <title>Re: Can I get more details on the performance differences between pyodbc and SQL Connector for Pytho</title>
      <link>https://community.databricks.com/t5/data-engineering/can-i-get-more-details-on-the-performance-differences-between/m-p/129893#M48624</link>
      <description>&lt;P&gt;Interesting article, but it seems that Cloud Fetch is supported by both the ODBC driver and the SQL Connector:&lt;/P&gt;&lt;P&gt;&lt;A href="https://docs.databricks.com/aws/en/integrations/odbc/capability#cloud-fetch-in-odbc" target="_blank"&gt;Driver capability settings for the Databricks ODBC Driver | Databricks on AWS&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&lt;A href="https://docs.databricks.com/aws/en/dev-tools/python-sql-connector#get-started" target="_blank"&gt;Databricks SQL Connector for Python | Databricks on AWS&lt;/A&gt;&amp;nbsp;(see the Get started section)&lt;/P&gt;</description>
      <pubDate>Wed, 27 Aug 2025 11:59:31 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/can-i-get-more-details-on-the-performance-differences-between/m-p/129893#M48624</guid>
      <dc:creator>Travis84</dc:creator>
      <dc:date>2025-08-27T11:59:31Z</dc:date>
    </item>
  </channel>
</rss>

