3 hours ago
@nkrish: Thank you for the response
If you don't need Spark parallelism, can you review these options?
Option-1: Expose the exe as an API from a separate Windows VM/container
- Call the API from Databricks. This call will be executed from the driver.
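For illustration, a rough sketch of what the driver-side call in Option-1 could look like from a notebook. The endpoint URL, the JSON payload shape, and the function names are all hypothetical; they depend entirely on how you wrap the exe on the Windows side.

```python
import json
import urllib.request

# Hypothetical endpoint of the Windows-hosted wrapper service; replace with your own.
API_URL = "http://exe-wrapper.example.internal:8080/run"


def build_request(url: str, payload: dict) -> urllib.request.Request:
    """Build a JSON POST request carrying the exe's input arguments."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call_exe_api(url: str, payload: dict, timeout: float = 300.0) -> dict:
    """Send the request from the driver and return the parsed JSON response."""
    req = build_request(url, payload)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Example call (assumes the wrapper accepts and returns JSON):
# result = call_exe_api(API_URL, {"input_path": "/mnt/data/in.csv"})
```

Since this runs only on the driver, a generous timeout and some error handling around the call are probably worth adding if the exe is long-running.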
Option-2: Recompile the exe for Linux
- Compile the C# project for the Linux platform
- Install the .NET runtime for Linux on the Databricks Runtime using an init script
- Copy the compiled binary from storage to the cluster using an init script
- Execute the binary using %sh, Python subprocess, or a similar approach in the notebook
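For the last step of Option-2, a minimal sketch of the Python subprocess approach. The binary path and arguments in the usage comment are hypothetical and assume the init script has already copied the compiled tool to the cluster's local disk.

```python
import subprocess
from typing import Sequence


def run_tool(cmd: Sequence[str], timeout: float = 600.0) -> str:
    """Run an external command on the driver and return its stdout.

    Raises subprocess.CalledProcessError on a non-zero exit code and
    subprocess.TimeoutExpired if the command runs longer than `timeout`.
    """
    result = subprocess.run(
        list(cmd),
        capture_output=True,  # collect stdout and stderr instead of inheriting them
        text=True,            # decode output as str rather than bytes
        timeout=timeout,
        check=True,           # fail loudly if the tool exits with an error
    )
    return result.stdout


# Example call (hypothetical path and arguments):
# output = run_tool(["dotnet", "/local_disk0/tools/MyTool.dll", "--input", "/dbfs/tmp/in.csv"])
```

Passing the command as a list rather than a shell string avoids quoting problems with paths and arguments.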
Both are anti-patterns. Option-1 is less invasive and will take less effort than Option-2.
Once you have reviewed and understood the internals of the exe, consider rewriting/porting it to Spark APIs for better maintainability and performance.