Error java.lang.NullPointerException using Autoloader

Maxi1693
New Contributor II

Hi!

I am pulling data from Blob storage into Databricks using Auto Loader. This process works well for almost 10 resources, but for one specific resource I get a java.lang.NullPointerException.

At first this looked like a problem connecting to the blob storage, but when I read the same resource with spark.read.parquet("/mnt/path/to/files/*.parquet") the process works well.

So the issue only occurs when I run Structured Streaming with the format "cloudFiles".

Below is the code used:

 

downtimeuptime_df = (
  spark.readStream.format("cloudFiles")
    .option("cloudFiles.format", "parquet")
    .option("cloudFiles.schemaLocation", f"/mnt/hist_data_delta/hist_data_delta.db/checkpoints/table_name_data_hmc")
    .option("cloudFiles.schemaEvolutionMode", None)
    .load(f'/mnt/source_data_bu/table_name_data/')
    .select(
      "*",
      lit(_bu).alias("_bu"),
      col("_metadata.file_path").alias("_source_file"),
      current_timestamp().alias("_processing_time"),
    )
)
 
Error description:
Py4JJavaError: An error occurred while calling o2702.load.
: java.lang.NullPointerException
    at com.databricks.sql.cloudfiles.options.CloudFilesOptionsBase.$anonfun$userProvidedEvolutionMode$1(CloudFilesOptionsBase.scala:162)
    at scala.Option.map(Option.scala:230)
    at com.databricks.sql.cloudfiles.options.CloudFilesOptionsBase.<init>(CloudFilesOptionsBase.scala:162)
    at com.databricks.sql.fileNotification.autoIngest.CloudFilesSourceOptions.<init>(CloudFilesSourceOptions.scala:45)
    at com.databricks.sql.fileNotification.autoIngest.CloudFilesSourceProvider.sourceSchema(CloudFilesSourceProvider.scala:84)
    at org.apache.spark.sql.execution.datasources.DataSource.sourceSchema(DataSource.scala:266)
    at org.apache.spark.sql.execution.datasources.DataSource.sourceInfo$lzycompute(DataSource.scala:150)
    at org.apache.spark.sql.execution.datasources.DataSource.sourceInfo(DataSource.scala:150)
    at org.apache.spark.sql.execution.streaming.StreamingRelation$.apply(StreamingRelation.scala:40)
    at org.apache.spark.sql.streaming.DataStreamReader.loadInternal(DataStreamReader.scala:223)
    at org.apache.spark.sql.streaming.DataStreamReader.load(DataStreamReader.scala:267)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
    at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:397)
    at py4j.Gateway.invoke(Gateway.java:306)
    at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
    at py4j.commands.CallCommand.execute(CallCommand.java:79)
    at py4j.ClientServerConnection.waitForCommands(ClientServerConnection.java:195)
    at py4j.ClientServerConnection.run(ClientServerConnection.java:115)
    at java.lang.Thread.run(Thread.java:750)

Accepted Solutions

shan_chandra
Honored Contributor III

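@Maxi1693 - The value for cloudFiles.schemaEvolutionMode should be a string. Passing Python None appears to reach the JVM as a null value, which is consistent with the NullPointerException raised in userProvidedEvolutionMode in your stack trace. Could you please try changing the line below from

 .option("cloudFiles.schemaEvolutionMode", None)

   to

 .option("cloudFiles.schemaEvolutionMode", "none")

and let us know?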

Reference: https://docs.databricks.com/en/ingestion/auto-loader/schema.html#how-does-auto-loader-schema-evoluti...
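
To catch this kind of mistake before the stream is created, one option is to collect the reader options in a dict and check them up front. The sketch below is only an illustration: check_autoloader_options is a hypothetical helper name, not part of Spark or Auto Loader, and it assumes the only problem to guard against is an option set to Python None.

```python
from typing import Any, Dict


def check_autoloader_options(options: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: fail fast if any reader option is Python None.

    A None value would otherwise only surface later as a JVM-side error
    when .load() is called.
    """
    for key, value in options.items():
        if value is None:
            raise ValueError(
                f"Option {key!r} is None; pass a string such as 'none' "
                "or leave the option out"
            )
    return dict(options)


# Options from the question, with the string "none" instead of None.
reader_options = check_autoloader_options(
    {
        "cloudFiles.format": "parquet",
        "cloudFiles.schemaLocation": "/mnt/hist_data_delta/hist_data_delta.db/checkpoints/table_name_data_hmc",
        "cloudFiles.schemaEvolutionMode": "none",
    }
)

# On a Databricks cluster the checked options could then be passed on
# (left commented out because it needs a live Spark session):
# df = (
#     spark.readStream.format("cloudFiles")
#     .options(**reader_options)
#     .load("/mnt/source_data_bu/table_name_data/")
# )
```

With the original value None, the helper raises a ValueError that names the offending option, which is easier to read than the Py4J stack trace above.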
