<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Substantial performance issues/degradation on Databricks when migrating job over from EMR in Machine Learning</title>
    <link>https://community.databricks.com/t5/machine-learning/substantial-performance-issues-degradation-on-databricks-when/m-p/31630#M1676</link>
    <description>Errors in Databricks when replicating a job that works in AWS EMR.</description>
    <pubDate>Thu, 15 Sep 2022 19:48:32 GMT</pubDate>
    <dc:creator>643926</dc:creator>
    <dc:date>2022-09-15T19:48:32Z</dc:date>
    <item>
      <title>Substantial performance issues/degradation on Databricks when migrating job over from EMR</title>
      <link>https://community.databricks.com/t5/machine-learning/substantial-performance-issues-degradation-on-databricks-when/m-p/31630#M1676</link>
      <description>&lt;P&gt;&lt;B&gt;Versions of Code:&lt;/B&gt;&lt;/P&gt;&lt;P&gt;Databricks: 7.3 LTS ML (includes Apache Spark 3.0.1, Scala 2.12)&lt;/P&gt;&lt;P&gt;AWS EMR: 6.1.0 (Spark 3.0.0, Scala 2.12)&lt;/P&gt;&lt;P&gt;&lt;A href="https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-610-release.html" target="test_blank"&gt;https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-610-release.html&lt;/A&gt; &lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;B&gt;The problem:&lt;/B&gt;&lt;/P&gt;&lt;P&gt;Errors in Databricks when replicating a job that works in AWS EMR.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;B&gt;Description and Setup:&lt;/B&gt;&lt;/P&gt;&lt;P&gt;We have a Spark job in AWS EMR that essentially runs the &lt;/P&gt;&lt;P&gt;```ALSModel.recommendForAllUsers(recommendations_ct)```&lt;/P&gt;&lt;P&gt;function and writes the result to AWS S3.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;We are currently attempting to migrate this job to our Databricks environment. We have copied over the exact same cluster configuration and Spark configuration values, and the Python code is identical as well.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;The configuration that executes in EMR but fails in Databricks:&lt;/P&gt;&lt;P&gt;6 r5.8xlarge Workers (256GB, 32 cores)&lt;/P&gt;&lt;P&gt;1 r5.2xlarge Driver (64GB, 8 cores)&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;with Spark configuration values:&lt;/P&gt;&lt;P&gt;```&lt;/P&gt;&lt;P&gt;spark.serializer org.apache.spark.serializer.KryoSerializer&lt;/P&gt;&lt;P&gt;spark.kryoserializer.buffer.max 2000m&lt;/P&gt;&lt;P&gt;spark.driver.memoryOverhead 4096&lt;/P&gt;&lt;P&gt;spark.executor.cores 5&lt;/P&gt;&lt;P&gt;spark.executor.memory 35G&lt;/P&gt;&lt;P&gt;spark.driver.cores 5&lt;/P&gt;&lt;P&gt;spark.executor.memoryOverhead 4096&lt;/P&gt;&lt;P&gt;spark.sql.shuffle.partitions 350&lt;/P&gt;&lt;P&gt;spark.broadcast.blockSize 12m&lt;/P&gt;&lt;P&gt;spark.executor.instances 35&lt;/P&gt;&lt;P&gt;spark.driver.memory 35G&lt;/P&gt;&lt;P&gt;spark.default.parallelism 
350&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;fs.s3a.server-side-encryption-algorithm SSE-KMS&lt;/P&gt;&lt;P&gt;spark.hadoop.fs.s3a.stsAssumeRole.arn arn:aws:iam::***REDACTED***:role/databricks-s3-egress&lt;/P&gt;&lt;P&gt;spark.hadoop.fs.s3a.acl.default BucketOwnerFullControl&lt;/P&gt;&lt;P&gt;spark.hadoop.fs.s3a.credentialsType AssumeRole&lt;/P&gt;&lt;P&gt;```&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;B&gt;The error we observe:&lt;/B&gt;&lt;/P&gt;&lt;P&gt;Errors consistent with lost executors due to OOM and other JVM issues. What's strange is this runs comfortably within EMR.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;ERROR RetryingBlockFetcher: Exception while beginning fetch of 1 outstanding blocks &lt;/P&gt;&lt;P&gt;java.io.IOException: Failed to connect to /***REDACTED***&lt;/P&gt;&lt;P&gt;	at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:253)&lt;/P&gt;&lt;P&gt;	at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:195)&lt;/P&gt;&lt;P&gt;	at org.apache.spark.network.netty.NettyBlockTransferService$$anon$2.createAndStart(NettyBlockTransferService.scala:122)&lt;/P&gt;&lt;P&gt;	at org.apache.spark.network.shuffle.RetryingBlockFetcher.fetchAllOutstanding(RetryingBlockFetcher.java:141)&lt;/P&gt;&lt;P&gt;	at org.apache.spark.network.shuffle.RetryingBlockFetcher.start(RetryingBlockFetcher.java:121)&lt;/P&gt;&lt;P&gt;	at org.apache.spark.network.netty.NettyBlockTransferService.fetchBlocks(NettyBlockTransferService.scala:143)&lt;/P&gt;&lt;P&gt;	at org.apache.spark.network.BlockTransferService.fetchBlockSync(BlockTransferService.scala:103)&lt;/P&gt;&lt;P&gt;	at org.apache.spark.storage.BlockManager.fetchRemoteManagedBuffer(BlockManager.scala:1011)&lt;/P&gt;&lt;P&gt;	at org.apache.spark.storage.BlockManager.$anonfun$getRemoteBlock$8(BlockManager.scala:955)&lt;/P&gt;&lt;P&gt;	at scala.Option.orElse(Option.scala:447)&lt;/P&gt;&lt;P&gt;	at 
org.apache.spark.storage.BlockManager.getRemoteBlock(BlockManager.scala:955)&lt;/P&gt;&lt;P&gt;	at org.apache.spark.storage.BlockManager.getRemoteBytes(BlockManager.scala:1093)&lt;/P&gt;&lt;P&gt;	at org.apache.spark.broadcast.TorrentBroadcast.$anonfun$readBlocks$1(TorrentBroadcast.scala:195)&lt;/P&gt;&lt;P&gt;	at scala.runtime.java8.JFunction1$mcVI$sp.apply(JFunction1$mcVI$sp.java:23)&lt;/P&gt;&lt;P&gt;	at scala.collection.immutable.List.foreach(List.scala:392)&lt;/P&gt;&lt;P&gt;	at org.apache.spark.broadcast.TorrentBroadcast.readBlocks(TorrentBroadcast.scala:184)&lt;/P&gt;&lt;P&gt;	at org.apache.spark.broadcast.TorrentBroadcast.$anonfun$readBroadcastBlock$4(TorrentBroadcast.scala:268)&lt;/P&gt;&lt;P&gt;	at scala.Option.getOrElse(Option.scala:189)&lt;/P&gt;&lt;P&gt;	at org.apache.spark.broadcast.TorrentBroadcast.$anonfun$readBroadcastBlock$2(TorrentBroadcast.scala:246)&lt;/P&gt;&lt;P&gt;	at org.apache.spark.util.KeyLock.withLock(KeyLock.scala:64)&lt;/P&gt;&lt;P&gt;	at org.apache.spark.broadcast.TorrentBroadcast.$anonfun$readBroadcastBlock$1(TorrentBroadcast.scala:241)&lt;/P&gt;&lt;P&gt;	at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1558)&lt;/P&gt;&lt;P&gt;	at org.apache.spark.broadcast.TorrentBroadcast.readBroadcastBlock(TorrentBroadcast.scala:241)&lt;/P&gt;&lt;P&gt;	at org.apache.spark.broadcast.TorrentBroadcast.getValue(TorrentBroadcast.scala:118)&lt;/P&gt;&lt;P&gt;	at org.apache.spark.broadcast.Broadcast.value(Broadcast.scala:78)&lt;/P&gt;&lt;P&gt;	at org.apache.spark.sql.execution.datasources.parquet.ParquetFileFormat.$anonfun$buildReaderWithPartitionValues$1(ParquetFileFormat.scala:309)&lt;/P&gt;&lt;P&gt;	at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1$$anon$2.getNext(FileScanRDD.scala:291)&lt;/P&gt;&lt;P&gt;	at org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:73)&lt;/P&gt;&lt;P&gt;	at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.$anonfun$prepareNextFile$1(FileScanRDD.scala:499)&lt;/P&gt;&lt;P&gt;	
at scala.concurrent.Future$.$anonfun$apply$1(Future.scala:659)&lt;/P&gt;&lt;P&gt;	at scala.util.Success.$anonfun$map$1(Try.scala:255)&lt;/P&gt;&lt;P&gt;	at scala.util.Success.map(Try.scala:213)&lt;/P&gt;&lt;P&gt;	at scala.concurrent.Future.$anonfun$map$1(Future.scala:292)&lt;/P&gt;&lt;P&gt;	at scala.concurrent.impl.Promise.liftedTree1$1(Promise.scala:33)&lt;/P&gt;&lt;P&gt;	at scala.concurrent.impl.Promise.$anonfun$transform$1(Promise.scala:33)&lt;/P&gt;&lt;P&gt;	at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:64)&lt;/P&gt;&lt;P&gt;	at org.apache.spark.util.threads.SparkThreadLocalCapturingRunnable.$anonfun$run$1(SparkThreadLocalForwardingThreadPoolExecutor.scala:104)&lt;/P&gt;&lt;P&gt;	at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)&lt;/P&gt;&lt;P&gt;	at org.apache.spark.util.threads.SparkThreadLocalCapturingHelper.runWithCaptured(SparkThreadLocalForwardingThreadPoolExecutor.scala:68)&lt;/P&gt;&lt;P&gt;	at org.apache.spark.util.threads.SparkThreadLocalCapturingHelper.runWithCaptured$(SparkThreadLocalForwardingThreadPoolExecutor.scala:54)&lt;/P&gt;&lt;P&gt;	at org.apache.spark.util.threads.SparkThreadLocalCapturingRunnable.runWithCaptured(SparkThreadLocalForwardingThreadPoolExecutor.scala:101)&lt;/P&gt;&lt;P&gt;	at org.apache.spark.util.threads.SparkThreadLocalCapturingRunnable.run(SparkThreadLocalForwardingThreadPoolExecutor.scala:104)&lt;/P&gt;&lt;P&gt;	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)&lt;/P&gt;&lt;P&gt;	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)&lt;/P&gt;&lt;P&gt;	at java.lang.Thread.run(Thread.java:748)&lt;/P&gt;&lt;P&gt;Caused by: io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: /10.203.234.49:34347&lt;/P&gt;&lt;P&gt;Caused by: java.net.ConnectException: Connection refused&lt;/P&gt;&lt;P&gt;	at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)&lt;/P&gt;&lt;P&gt;	at 
sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:716)&lt;/P&gt;&lt;P&gt;	at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:330)&lt;/P&gt;&lt;P&gt;	at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:334)&lt;/P&gt;&lt;P&gt;	at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:702)&lt;/P&gt;&lt;P&gt;	at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:650)&lt;/P&gt;&lt;P&gt;	at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:576)&lt;/P&gt;&lt;P&gt;	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493)&lt;/P&gt;&lt;P&gt;	at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)&lt;/P&gt;&lt;P&gt;	at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)&lt;/P&gt;&lt;P&gt;	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)&lt;/P&gt;&lt;P&gt;	at java.lang.Thread.run(Thread.java:748)&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Any help/thoughts would be greatly appreciated.&lt;/P&gt;</description>
      <pubDate>Thu, 15 Sep 2022 19:48:32 GMT</pubDate>
      <guid>https://community.databricks.com/t5/machine-learning/substantial-performance-issues-degradation-on-databricks-when/m-p/31630#M1676</guid>
      <dc:creator>643926</dc:creator>
      <dc:date>2022-09-15T19:48:32Z</dc:date>
    </item>
  </channel>
</rss>

