Thursday
We've been using Auto Loader to ingest data from an Azure storage account (format "cloudFiles").
Today, we're starting to see failures during event notification setup:
25/09/11 19:06:28 ERROR MicroBatchExecution: Non-interrupted exception thrown for queryId=[REDACTED],runId=[REDACTED]: org.json4s.MappingException: Do not know how to convert JArray(List(JString([REDACTED]))) into class java.lang.String
org.json4s.MappingException: Do not know how to convert JArray(List(JString([REDACTED]))) into class java.lang.String
at org.json4s.reflect.package$.fail(package.scala:53)
at org.json4s.Extraction$.convert(Extraction.scala:888)
at org.json4s.Extraction$.$anonfun$extract$10(Extraction.scala:456)
at org.json4s.Extraction$.$anonfun$customOrElse$1(Extraction.scala:780)
at scala.PartialFunction.applyOrElse(PartialFunction.scala:127)
at scala.PartialFunction.applyOrElse$(PartialFunction.scala:126)
at scala.PartialFunction$$anon$1.applyOrElse(PartialFunction.scala:257)
at org.json4s.Extraction$.customOrElse(Extraction.scala:780)
at org.json4s.Extraction$.extract(Extraction.scala:454)
at org.json4s.Extraction$.org$json4s$Extraction$$extractDetectingNonTerminal(Extraction.scala:482)
at org.json4s.Extraction$.$anonfun$extract$8(Extraction.scala:426)
at scala.collection.immutable.List.map(List.scala:297)
at org.json4s.Extraction$.$anonfun$extract$7(Extraction.scala:424)
at org.json4s.Extraction$.$anonfun$customOrElse$1(Extraction.scala:780)
at scala.PartialFunction.applyOrElse(PartialFunction.scala:127)
at scala.PartialFunction.applyOrElse$(PartialFunction.scala:126)
at scala.PartialFunction$$anon$1.applyOrElse(PartialFunction.scala:257)
at org.json4s.Extraction$.customOrElse(Extraction.scala:780)
at org.json4s.Extraction$.extract(Extraction.scala:420)
at org.json4s.Extraction$.extract(Extraction.scala:56)
at org.json4s.ExtractableJsonAstNode.extract(ExtractableJsonAstNode.scala:22)
at org.json4s.jackson.JacksonSerialization.read(Serialization.scala:62)
at org.json4s.Serialization.read(Serialization.scala:31)
at org.json4s.Serialization.read$(Serialization.scala:31)
at org.json4s.jackson.JacksonSerialization.read(Serialization.scala:23)
at com.databricks.sql.aqs.EventGridClient.generateAccessTokenUsingClientSecret(EventGridClient.scala:180)
at com.databricks.sql.aqs.EventGridClient.generateAccessToken(EventGridClient.scala:238)
at com.databricks.sql.aqs.autoIngest.AzureEventNotificationSetup$.getToken(AzureEventNotificationSetup.scala:345)
at com.databricks.sql.aqs.autoIngest.AzureEventNotificationSetup$.$anonfun$buildStorageClient$2(AzureEventNotificationSetup.scala:387)
at scala.Option.getOrElse(Option.scala:189)
at com.databricks.sql.aqs.autoIngest.AzureEventNotificationSetup$.buildStorageClient(AzureEventNotificationSetup.scala:384)
at com.databricks.sql.aqs.autoIngest.AzureEventNotificationSetup.<init>(AzureEventNotificationSetup.scala:70)
at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:77)
at java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.base/java.lang.reflect.Constructor.newInstanceWithCaller(Constructor.java:500)
at java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:481)
at com.databricks.sql.fileNotification.autoIngest.EventNotificationSetup$.$anonfun$create$1(EventNotificationSetup.scala:68)
at com.databricks.sql.fileNotification.autoIngest.ResourceManagementUtils$.unwrapInvocationTargetException(ResourceManagementUtils.scala:42)
at com.databricks.sql.fileNotification.autoIngest.EventNotificationSetup$.create(EventNotificationSetup.scala:50)
at com.databricks.sql.fileNotification.autoIngest.CloudFilesSourceProvider.$anonfun$createSource$2(CloudFilesSourceProvider.scala:143)
at scala.Option.getOrElse(Option.scala:189)
at com.databricks.sql.fileNotification.autoIngest.CloudFilesSourceProvider.createSource(CloudFilesSourceProvider.scala:128)
at org.apache.spark.sql.execution.datasources.DataSource.createSource(DataSource.scala:346)
at org.apache.spark.sql.execution.streaming.MicroBatchExecution$$anonfun$2.$anonfun$applyOrElse$2(MicroBatchExecution.scala:223)
...
Thursday
Hello @Malthe
Thank you for sharing the error.
The part of the error message that drew my attention is:
.EventGridClient.generateAccessTokenUsingClientSecret
The role assignments needed for event notification setup:

| Role | Scope | Purpose |
| --- | --- | --- |
| Storage Blob Data Contributor | Storage account | Read/write blobs for file discovery. |
| Storage Queue Data Contributor | Storage account | Manage queues for notifications (if not using a connection string). |
| EventGrid EventSubscription Contributor | Resource group (or subscription) | Create/read/delete Event Grid subscriptions. |
| Contributor | Storage account and resource group | General setup (broader; use if custom roles fail). |
Also, remove any unnecessary app role assignments (a likely root cause).
Thursday - last edited Thursday
These are the current role assignments of this service principal:
They seem to be correct, and also:
Could it be that somehow the Azure Management Endpoint for the event grid is returning a different kind of response all of a sudden? This is a traceback from Databricks' own integration code, so there isn't much to go on here.
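To illustrate the suspected failure mode: the trace shows json4s failing to convert a `JArray` into a `java.lang.String`, which is exactly what happens when a client strictly deserializes a token response and a field that used to be a plain string starts arriving as a JSON array. A Python sketch of that class of error (the field name here is hypothetical; the actual failing field is redacted in the trace):

```python
import json

def extract_token(response_body: str) -> str:
    """Parse a token response, expecting a plain string field.

    Mirrors the strict extraction in the stack trace: if the value
    arrives as a list instead of a string, extraction fails.
    """
    payload = json.loads(response_body)
    token = payload["access_token"]  # hypothetical field name
    if isinstance(token, list):
        # Analogous to json4s: "Do not know how to convert JArray(...)
        # into class java.lang.String"
        raise TypeError(f"expected str, got list: {token!r}")
    return token

# A response shaped like the old contract parses fine:
print(extract_token('{"access_token": "abc"}'))

# ...while an array-valued field reproduces the class of error seen above:
try:
    extract_token('{"access_token": ["abc"]}')
except TypeError as exc:
    print(exc)
```

If the management endpoint changed a response field from a string to an array (even a single-element one), a strictly typed client would fail in precisely this way.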
Friday
I'm having exactly the same issue with multiple pipelines in 3 different environments, starting approximately 11.9.2025 10:00 EEST.
Friday
Hi, we're seeing the same issue on several queue-based ingestion jobs, with a couple hundred tasks failing. It was intermittent yesterday (10 Sep 2025), with a few random tasks failing in each run, but the issue has now spread to all tasks failing in every run. I've given the service principal all the roles suggested above, but to no avail. I suspect it could have to do with a change in the Azure Event Grid response structure.
Friday - last edited Friday
I'm seeing these two updates from Microsoft on 10 Sep 2025:
Both seem like candidates.
Saturday
Falling back to file listing mode worked as a band-aid, but I don't see that as a long-term solution due to the costs of file listing operations (especially with a large number of files).
In practice, I removed the Event Grid-related options from the stream reader:
cloudFiles.useNotifications
cloudFiles.resourceGroup
cloudFiles.subscriptionId
cloudFiles.clientId
cloudFiles.clientSecret
cloudFiles.tenantId
My top candidate would then be the Azure Storage API changes shared by @Malthe
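For anyone applying the same band-aid, a minimal sketch of the listing-mode reader (JSON format, paths, and schema location are placeholders; with `cloudFiles.useNotifications` omitted it defaults to false, so Auto Loader discovers files by listing the input directory instead of via Event Grid):

```python
# Listing-mode fallback: no notification/Event Grid options are set,
# so Auto Loader lists the directory to discover new files.
# All paths and the format below are placeholders for your setup.
stream = (
    spark.readStream.format("cloudFiles")
    .option("cloudFiles.format", "json")
    .option("cloudFiles.schemaLocation",
            "abfss://container@storageaccount.dfs.core.windows.net/_schemas/")
    .load("abfss://container@storageaccount.dfs.core.windows.net/path/")
)
```

Note that directory listing incurs storage list-operation costs proportional to the number of files, which is exactly the downside mentioned above.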
Saturday
The problem seems to have resolved itself today on our platform. Did you try the previous ingestion setup again (before applying the workaround)?
Saturday
Thanks! Interesting: it now works on our platform too.
Saturday
That is, I reverted back to Event Grid mode and it works.
Saturday
Hello @Malthe @Saska @MehdiJafari
When it was showing this error, did you try any alternative methods, such as using Databricks service credentials?
For example:
stream = (spark.readStream.format("cloudFiles")
    .option("cloudFiles.format", "json")
    .option("databricks.serviceCredential", "my-storage-credential")
    .option("cloudFiles.useNotifications", "true")
    .option("cloudFiles.resourceGroup", "my-rg")
    .option("cloudFiles.subscriptionId", "my-sub-id")
    .load("abfss://container@storageaccount.dfs.core.windows.net/path/"))