01-30-2022 08:01 AM
I tried to benchmark the Power BI Databricks connector against the Power BI Delta Lake reader on a dataset of 2.15 million rows. The Delta Lake reader took 20 seconds, while importing through the SQL compute endpoint took ~75 seconds.
When I look at the query profile in SQL compute, I see that 50 seconds are spent in the "Columnar To Row" step. This makes me rather suspicious, since I was under the impression that an up-to-date Power BI would take advantage of Cloud Fetch, which creates files containing Apache Arrow batches, i.e. a columnar format. So why the conversion to rows? Maybe it is not actually using Cloud Fetch? Is there any way to verify that I am actually using Cloud Fetch, either in the Power BI logs or in the Databricks SQL compute endpoint web interface?
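For reference, one way to compare the two fetch paths outside of Power BI is a minimal sketch with the databricks-sql-connector Python package, assuming its use_cloud_fetch connection flag (hostname, HTTP path, token and table below are placeholders):

```python
# Hypothetical benchmark: time a full fetch with Cloud Fetch on vs. off
# against the same SQL endpoint. All connection values are placeholders.
import time
from databricks import sql  # pip install databricks-sql-connector

def timed_fetch(use_cloud_fetch: bool) -> float:
    start = time.perf_counter()
    with sql.connect(
        server_hostname="adb-1234567890123456.7.azuredatabricks.net",  # placeholder
        http_path="/sql/1.0/warehouses/abc123",                        # placeholder
        access_token="dapi...",                                        # placeholder
        use_cloud_fetch=use_cloud_fetch,  # toggles Cloud Fetch in this connector
    ) as conn:
        with conn.cursor() as cursor:
            cursor.execute("SELECT * FROM my_table")  # placeholder table
            cursor.fetchall()
    return time.perf_counter() - start

print("cloud fetch on :", timed_fetch(True))
print("cloud fetch off:", timed_fetch(False))
```

If the two timings differ sharply, that at least tells you whether Cloud Fetch is the variable, even if it does not show what the Power BI connector itself is doing.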
06-23-2022 12:53 AM
You would need to set the EnableQueryResultsDownload flag to 0 (zero), which will disable Cloud Fetch.
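With a DSN-less pyodbc connection this could look like the sketch below; host, HTTP path and token are placeholders, and the flag name is taken from the reply above, so check it against your driver version's documentation:

```python
# Sketch: passing the Cloud-Fetch-disabling flag in a DSN-less connection
# string to the Simba Spark ODBC driver. Connection values are placeholders.
import pyodbc

conn = pyodbc.connect(
    "Driver=Simba Spark ODBC Driver;"
    "Host=adb-1234567890123456.7.azuredatabricks.net;"  # placeholder
    "Port=443;"
    "HTTPPath=/sql/1.0/warehouses/abc123;"              # placeholder
    "SSL=1;ThriftTransport=2;AuthMech=3;"
    "UID=token;PWD=dapi...;"                            # placeholder token
    "EnableQueryResultsDownload=0;",                    # 0 = disable Cloud Fetch
    autocommit=True,
)
```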
06-23-2022 12:54 AM
So why is ColumnarToRow required?
10-26-2022 04:47 AM
Hi everyone, check out my latest blog post to verify whether or not Cloud Fetch is actually used; you may also find some other optimizations there:
02-27-2023 07:24 AM
Guys, is there any way to switch off CloudFetch and fall back to ArrowResultSet by default, irrespective of result size, using the latest version of the Simba Spark ODBC driver?
4 weeks ago
I'm troubleshooting slow speeds (~6 Mbps) from Azure Databricks to the Power BI Service (Fabric) via dataflows.
Does this mean that CloudFetch is not enabled here?
In the neo4j logs I also see CloudStoreBasedResultHandler receiving and responding to getNextCloudStoreBasedSet, which I interpret as Cloud Fetch being enabled?
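As a cross-check outside Power BI, a minimal sketch with the databricks-sql-connector Python package and debug logging can show whether result links are being downloaded; the exact log wording varies by connector version, and all connection values below are placeholders:

```python
# Sketch: enable DEBUG logging and fetch a large result set, then scan the
# output for cloud-fetch-related messages. Connection values are placeholders.
import logging
from databricks import sql  # pip install databricks-sql-connector

logging.basicConfig(level=logging.DEBUG)

with sql.connect(
    server_hostname="adb-1234567890123456.7.azuredatabricks.net",  # placeholder
    http_path="/sql/1.0/warehouses/abc123",                        # placeholder
    access_token="dapi...",                                        # placeholder
) as conn:
    with conn.cursor() as cursor:
        cursor.execute("SELECT * FROM my_table LIMIT 100000")  # placeholder
        cursor.fetchall()

# With DEBUG logging on, a large result fetched via Cloud Fetch should surface
# log lines about downloading presigned result links; their absence suggests
# the inline Arrow result path is being used instead.
```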