Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

[CANNOT_OPEN_SOCKET] Can not open socket: ["tried to connect to ('127.0.0.1', 45287)

timo82
New Contributor

Hello,

After Databricks updated the Runtime from Release 15.4.24 to Release 15.4.25, all of our jobs fail with the error:

[CANNOT_OPEN_SOCKET] Can not open socket: ["tried to connect to ('127.0.0.1', 45287)

What can we do here?

Greetings

1 ACCEPTED SOLUTION


Vasireddy
Contributor II

Yes, exactly. Changing spark_version from 15.4.x-scala2.12 to 15.4.24-scala2.12 pins your cluster to the 15.4.24 patch and prevents it from auto-upgrading to the problematic 15.4.25 release.

harisankar


7 REPLIES

Advika
Databricks Employee

Hello @timo82!

Can you try adding 'spark.databricks.pyspark.useFileBasedCollect': 'true' to your Spark config?
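For reference, on a job cluster that setting goes in the cluster's Spark config; in a bundle-style YAML like the one shared further down this thread, it would sit under spark_conf (a sketch of the placement, not a guaranteed fix):

 spark_conf:
   spark.databricks.pyspark.useFileBasedCollect: true

Judging by the name, this presumably routes collected results through files instead of the local socket, which could sidestep CANNOT_OPEN_SOCKET failures.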

Vasireddy
Contributor II

Hey @timo82,

This error indicates that the Python workers cannot communicate with the JVM after the maintenance update. Since it affects all jobs after upgrading to 15.4.25, try these steps:

--> Completely restart the cluster: stop it and then start it (not just "restart") to reinitialize the socket listeners
--> Check init scripts: temporarily remove any cluster init scripts and test whether the jobs succeed without them, as maintenance updates can introduce incompatibilities
--> Review Spark configurations: check the driver logs for deprecated or conflicting Spark configs that may have changed between 15.4.24 and 15.4.25

Code workarounds:
--> Add a warmup operation: insert a simple action such as df.limit(1).collect() at the start of your jobs, before the main processing, to establish the connection
--> Implement retry logic: wrap the initial Spark actions in try/except blocks, as socket errors can be transient during startup

These workarounds address the timing and initialization issues behind the socket error between the Python workers and the JVM; a sketch of both combined follows below.
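A minimal sketch of the warmup-plus-retry idea, assuming a standard PySpark session (the function name, retry count, and backoff values are illustrative, not from Databricks docs):

 import time
 from pyspark.sql import SparkSession

 def warmup_spark(spark: SparkSession, retries: int = 3) -> None:
     # Run a trivial action first so the Python<->JVM socket is established
     # before the real job logic; retry because startup socket errors can be transient.
     for attempt in range(retries):
         try:
             spark.range(1).collect()  # cheap action that exercises the collect path
             return
         except Exception:
             if attempt == retries - 1:
                 raise  # give up after the last attempt
             time.sleep(5 * (attempt + 1))  # simple linear backoff between attempts

 spark = SparkSession.builder.getOrCreate()
 warmup_spark(spark)
 # ... main job logic follows ...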

If it's still failing:
--> Check the cluster access mode: verify you're using the appropriate access mode (Shared or Single User) for your workload
--> Increase cluster resources: scale up memory if the errors are intermittent under load
--> Roll back to 15.4.24: if this is blocking production, temporarily revert while investigating further
--> Contact Databricks support: since this affects all jobs after a maintenance update, there may be a regression in 15.4.25

 

harisankar

Thanks for the details.

How can we roll back to 15.4.24?

We only configure the cluster type in the YAML, not the runtime version.

 

job_clusters:
  - job_cluster_key: default
    new_cluster:
      spark_version: 15.4.x-scala2.12
      node_type_id: Standard_D64s_v3
      autoscale:
        min_workers: 1
        max_workers: 5
      enable_elastic_disk: true
      data_security_mode: SINGLE_USER
      spark_conf:
        spark.databricks.pip.ignoreSSL: true
        spark.sql.inMemoryColumnarStorage.compressed: true
        spark.sql.adaptive.enabled: true
        spark.sql.adaptive.coalescePartitions.enabled: true
        spark.databricks.delta.schema.autoMerge.enabled: true
        spark.databricks.adaptive.autoOptimizeShuffle.enabled: true
        spark.executor.heartbeatInterval: 300000
        spark.network.timeout: 320000
        spark.sql.codegen: true
 

Greetings

timo82
New Contributor

spark_version: 15.4.x-scala2.12

to

spark_version: 15.4.24-scala2.12 

Correct?

Vasireddy
Contributor II

Yes, exactly. Changing spark_version from 15.4.x-scala2.12 to 15.4.24-scala2.12 pins your cluster to the 15.4.24 patch and prevents it from auto-upgrading to the problematic 15.4.25 release.

harisankar

Hansjoerg
New Contributor II

@Vasireddy 
Using Bundles doesn't seem to allow providing a fixed patch version:

Error: cannot update job: INVALID_PARAMETER_VALUE: Invalid spark version 15.4.24-scala2.12.
  with databricks_job.pdv-partnerbul-dbxservice-housekeeping,
  on bundle.tf

Vasireddy
Contributor II

Hi @Hansjoerg,

Apologies for the confusion earlier. You're right: Bundles don't allow pinning to a specific patch version like 15.4.24.

Your best option is to skip Bundles for now and use the regular Databricks Jobs setup (via the UI or the Jobs API), where you can specify exactly 15.4.24-scala2.12 and avoid the broken 15.4.25 version.

This will let you roll back to the working version while Databricks fixes the socket issue in 15.4.25.
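If it helps, here's a rough sketch of pinning the version through the Jobs API 2.1 update endpoint with plain Python; the environment variable names and the job ID are placeholders for your setup:

 import os
 import requests

 host = os.environ["DATABRICKS_HOST"]    # workspace URL, e.g. https://adb-....azuredatabricks.net
 token = os.environ["DATABRICKS_TOKEN"]  # token with permission to edit the job
 job_id = 123                            # placeholder: your job's ID

 # Partial update: only the job_clusters block is replaced, pinning the patch release.
 resp = requests.post(
     f"{host}/api/2.1/jobs/update",
     headers={"Authorization": f"Bearer {token}"},
     json={
         "job_id": job_id,
         "new_settings": {
             "job_clusters": [{
                 "job_cluster_key": "default",
                 "new_cluster": {
                     "spark_version": "15.4.24-scala2.12",
                     "node_type_id": "Standard_D64s_v3",
                     "autoscale": {"min_workers": 1, "max_workers": 5},
                 },
             }],
         },
     },
     timeout=30,
 )
 resp.raise_for_status()

Once Databricks fixes 15.4.25, you can switch spark_version back to 15.4.x-scala2.12 the same way.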

harisankar
