01-27-2024 10:08 AM
I hope this works:
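// Read the file as a binary stream keyed by its path.
JavaPairRDD<String, PortableDataStream> jrdd = javaSparkContext.binaryFiles("<path_to_file>");
// Collect to the driver; this assumes the data fits in driver memory.
Map<String, PortableDataStream> mp = jrdd.collectAsMap();
// f is the local java.io.File to write to (defined elsewhere).
// try-with-resources flushes and closes the stream even if a write fails.
try (OutputStream os = new FileOutputStream(f)) {
    for (PortableDataStream pd : mp.values()) {
        os.write(pd.toArray());
    }
} catch (IOException e) {
    throw new RuntimeException(e);
}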
Then supply the file to the JAXB unmarshaller. I'm not sure whether there is a better way.
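One possible alternative, offered as a sketch rather than a tested solution: since `pd.toArray()` already returns the whole file as a `byte[]`, the bytes could be wrapped in a `ByteArrayInputStream` and parsed directly, skipping the temporary file. JAXB's `Unmarshaller.unmarshal(InputStream)` accepts such a stream. To keep the example runnable without Spark or a JAXB dependency, the sketch below substitutes the JDK's built-in DOM parser for the unmarshaller and a hard-coded byte array for the Spark data; the class name `InMemoryXmlSketch`, the `parse` helper, and the sample XML are all made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class InMemoryXmlSketch {

    // Parses XML straight from a byte array, with no intermediate file.
    // xmlBytes stands in for what pd.toArray() returns in the snippet above.
    static Document parse(byte[] xmlBytes) throws Exception {
        try (InputStream in = new ByteArrayInputStream(xmlBytes)) {
            // With JAXB, this stream would be passed to
            // Unmarshaller.unmarshal(InputStream) instead of the DOM parser.
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(in);
        }
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical sample document; real input would come from Spark.
        byte[] bytes = "<order id=\"7\"><item>book</item></order>"
                .getBytes(StandardCharsets.UTF_8);
        Document doc = parse(bytes);
        System.out.println(doc.getDocumentElement().getTagName()); // prints "order"
    }
}
```

This still assumes each file fits in memory, which the `collectAsMap()` approach assumes as well.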