
Databricks Connect: Enabling Arrow on Serverless Compute

kunalmishra9
New Contributor III

I recently upgraded my Databricks Connect version to 15.4 and got set up for serverless compute, but ran into the following error when running the standard code to enable Arrow in PySpark:

>>> spark.conf.set(key='spark.sql.execution.arrow.pyspark.enabled', value='true')
pyspark.errors.exceptions.connect.AnalysisException: [CONFIG_NOT_AVAILABLE] Configuration spark.sql.execution.arrow.pyspark.enabled is not available. SQLSTATE: 42K0I
JVM stacktrace:
org.apache.spark.sql.AnalysisException
	at com.databricks.sql.connect.SparkConnectConfig$.assertConfigAllowed(SparkConnectConfig.scala:219)
	at org.apache.spark.sql.connect.service.SparkConnectConfigHandler$RuntimeConfigWrapper.set(SparkConnectConfigHandler.scala:88)
	at org.apache.spark.sql.connect.service.SparkConnectConfigHandler.$anonfun$handleSet$1(SparkConnectConfigHandler.scala:230)
	at org.apache.spark.sql.connect.service.SparkConnectConfigHandler.$anonfun$handleSet$1$adapted(SparkConnectConfigHandler.scala:228)
	at scala.collection.Iterator.foreach(Iterator.scala:943)
	at scala.collection.Iterator.foreach$(Iterator.scala:943)
	at scala.collection.AbstractIterator.foreach(Iterator.scala:1431)
	at org.apache.spark.sql.connect.service.SparkConnectConfigHandler.handleSet(SparkConnectConfigHandler.scala:228)
	at org.apache.spark.sql.connect.service.SparkConnectConfigHandler.handle(SparkConnectConfigHandler.scala:201)
	at org.apache.spark.sql.connect.service.SparkConnectService.config(SparkConnectService.scala:123)
	at org.apache.spark.connect.proto.SparkConnectServiceGrpc$MethodHandlers.invoke(SparkConnectServiceGrpc.java:805)
	at grpc_shaded.io.grpc.stub.ServerCalls$UnaryServerCallHandler$UnaryServerCallListener.onHalfClose(ServerCalls.java:182)
	at com.databricks.spark.connect.service.AuthenticationInterceptor$AuthenticatedServerCallListener.$anonfun$onHalfClose$1(AuthenticationInterceptor.scala:310)
	at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
	at com.databricks.unity.UCSEphemeralState$Handle.runWith(UCSEphemeralState.scala:51)
	at com.databricks.unity.HandleImpl.runWith(UCSHandle.scala:104)
	at com.databricks.spark.connect.service.RequestContext.$anonfun$runWith$3(RequestContext.scala:286)
	at com.databricks.spark.connect.service.RequestContext$.com$databricks$spark$connect$service$RequestContext$$withLocalProperties(RequestContext.scala:473)
	at com.databricks.spark.connect.service.RequestContext.$anonfun$runWith$2(RequestContext.scala:286)
	at com.databricks.logging.AttributionContextTracing.$anonfun$withAttributionContext$1(AttributionContextTracing.scala:48)
	at com.databricks.logging.AttributionContext$.$anonfun$withValue$1(AttributionContext.scala:276)
	at scala.util.DynamicVariable.withValue(DynamicVariable.scala:62)
	at com.databricks.logging.AttributionContext$.withValue(AttributionContext.scala:272)
	at com.databricks.logging.AttributionContextTracing.withAttributionContext(AttributionContextTracing.scala:46)
	at com.databricks.logging.AttributionContextTracing.withAttributionContext$(AttributionContextTracing.scala:43)
	at com.databricks.spark.util.PublicDBLogging.withAttributionContext(DatabricksSparkUsageLogger.scala:29)
	at com.databricks.spark.util.UniverseAttributionContextWrapper.withValue(AttributionContextUtils.scala:228)
	at com.databricks.spark.connect.service.RequestContext.$anonfun$runWith$1(RequestContext.scala:285)
	at com.databricks.spark.connect.service.RequestContext.withContext(RequestContext.scala:298)
	at com.databricks.spark.connect.service.RequestContext.runWith(RequestContext.scala:278)
	at com.databricks.spark.connect.service.AuthenticationInterceptor$AuthenticatedServerCallListener.onHalfClose(AuthenticationInterceptor.scala:310)
	at grpc_shaded.io.grpc.PartialForwardingServerCallListener.onHalfClose(PartialForwardingServerCallListener.java:35)
	at grpc_shaded.io.grpc.ForwardingServerCallListener.onHalfClose(ForwardingServerCallListener.java:23)
	at grpc_shaded.io.grpc.ForwardingServerCallListener$SimpleForwardingServerCallListener.onHalfClose(ForwardingServerCallListener.java:40)
	at grpc_shaded.io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.halfClosed(ServerCallImpl.java:351)
	at grpc_shaded.io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1HalfClosed.runInContext(ServerImpl.java:861)
	at grpc_shaded.io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
	at grpc_shaded.io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:133)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
	at java.lang.Thread.run(Thread.java:840)

When I disabled serverless and connected to a standard cluster, there was no error. So the ask is to either

1) make enabling Arrow for PySpark work on serverless compute, or

2) if the restriction is intended, fix the error raised when the conf is set, perhaps replacing it with an explicit warning that Arrow can't be enabled on serverless compute; either way, the limitation should be added to the linked documentation in both locations.

For now, I've wrapped that bit of code in a try/except to gracefully handle the error no matter what I'm connecting to.
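
For reference, here is roughly what that wrapper looks like: a minimal sketch, assuming a Databricks Connect session built with DatabricksSession (the structure is my own; the conf name and exception class come straight from the error above).

from databricks.connect import DatabricksSession
from pyspark.errors.exceptions.connect import AnalysisException

spark = DatabricksSession.builder.getOrCreate()

try:
    # On classic compute this enables Arrow; serverless rejects the conf.
    spark.conf.set("spark.sql.execution.arrow.pyspark.enabled", "true")
except AnalysisException as e:
    # [CONFIG_NOT_AVAILABLE] just means the conf isn't settable on this
    # compute type (e.g. serverless), so it's safe to continue without it.
    # Anything else is a real problem and should propagate.
    if "CONFIG_NOT_AVAILABLE" not in str(e):
        raise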

 

2 REPLIES

Walter_C
Databricks Employee
(Accepted solution)

Serverless is currently limited to only a few Spark confs, as mentioned in the docs (see the serverless compute limitations page); spark.sql.execution.arrow.pyspark.enabled is not one of them, which is why setting it fails with CONFIG_NOT_AVAILABLE.
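
To make the limitation concrete, and assuming I'm reading the limitations page correctly that spark.sql.session.timeZone is one of the handful of supported confs, a serverless session behaves roughly like this (illustrative, not captured output):

>>> spark.conf.set("spark.sql.session.timeZone", "UTC")  # on the documented allowlist, succeeds
>>> spark.conf.set("spark.sql.execution.arrow.pyspark.enabled", "true")  # not on the list, raises CONFIG_NOT_AVAILABLE

Anything outside the documented set has to stay at the platform default on serverless.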

kunalmishra9
New Contributor III

Gotcha, thanks! Missed it in the limitations.
