Re: java.lang.IllegalArgumentException: java.net.U...

Alexey · ‎11-21-2022

By reloading I mean loading all the existing data in that folder. As mentioned above:

```python
autoloader = spark.readStream.format("cloudFiles") \
    .option("cloudFiles.format", data_format) \
    .option("header", "true") \
    .option("cloudFiles.schemaLocation", schema_location) \
    .option("cloudFiles.allowOverwrites", "true") \
    .load(path)
```

In the second case, where Autoloader will fail (at least we know from experience that it does when file names contain a colon), we use a simple batch load:

```python
df = spark.read.format(data_format) \
    .option("header", "true") \
    .load(path)
```

That is why I mentioned that, luckily for us, this data folder is not that large, so the full load runs fast.
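For anyone who wants to automate the choice between the two cases, here is a rough sketch of how it could look. It is not what we actually run: `file_name`, `names_with_colon` and `load_folder` are hypothetical helper names, and it assumes you already have the folder listing as a list of plain path strings. The only check it does is for a colon in the file name itself (the colon in a scheme such as `s3://` is ignored).

```python
def file_name(path: str) -> str:
    """Return the last path component, ignoring any trailing slash."""
    return path.rstrip("/").rsplit("/", 1)[-1]


def names_with_colon(file_paths):
    """Return the paths whose file name (not the URI scheme) contains a colon."""
    return [p for p in file_paths if ":" in file_name(p)]


def load_folder(spark, path, data_format, schema_location, file_paths):
    """Hypothetical helper: pick one of the two loads from this post."""
    if names_with_colon(file_paths):
        # Plain batch read: re-reads everything in the folder on every run.
        return (
            spark.read.format(data_format)
            .option("header", "true")
            .load(path)
        )
    # Autoloader stream with the same options as in the first snippet.
    return (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", data_format)
        .option("header", "true")
        .option("cloudFiles.schemaLocation", schema_location)
        .option("cloudFiles.allowOverwrites", "true")
        .load(path)
    )
```

In a Databricks notebook the listing could probably come from something like `[f.path for f in dbutils.fs.ls(path)]`, but that is an assumption about your setup. Keep in mind that one branch returns a batch DataFrame and the other a streaming one, so the code that writes the result has to handle both.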