Calling exe from notebook
6 hours ago
4 hours ago
@nkrish: Databricks clusters typically run Linux, so a Windows-only executable (.exe) will not run there.
Even if you port the executable to Linux, it will run directly on the driver, so you would lose Spark's distributed-computing capability.
Could you please provide the following details?
1) Purpose of the .exe - what does it do?
2) Whether you need Spark parallelism
4 hours ago
@siva-anantha , We are working on a migration project where some of the core business logic lives in C# code compiled as an .exe. We have two options: either rewrite the code in PySpark or use the exe as-is. Since we have a very limited timeline, we are thinking of calling the exe as-is from the notebook. Please find my responses inline below.
1) Purpose of the .exe - what does it do?
Core business logic such as calculations, filtering, and brand scanning all happens in the exe.
2) Whether you need Spark parallelism
We are not entirely sure, as we have not gone through the code fully.
3 hours ago
@nkrish: Thank you for the response.
If you don't need Spark parallelism, can you review these options?
Option-1: APIfy the exe in a separate Windows VM/container
- call the API from Databricks; this call will be executed from the driver
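For Option-1, here is a minimal sketch of calling such a service from the driver. The URL, route, and payload shape are hypothetical placeholders, not your actual service; adapt them to however the exe gets wrapped:

```python
import json
import urllib.request

# Hypothetical endpoint for the .exe wrapped as a REST API on a Windows VM.
API_URL = "https://my-windows-vm.example.com/api/calculate"  # assumption

def build_payload(records):
    """Serialize input rows (a list of dicts) into a JSON request body."""
    return json.dumps({"rows": records}).encode("utf-8")

def call_exe_api(records):
    """POST the rows to the service from the driver and return its JSON reply."""
    req = urllib.request.Request(
        API_URL,
        data=build_payload(records),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=60) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Note this runs single-threaded on the driver, so it inherits the same scalability limits discussed above.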
Option-2: Recompile the exe for Linux
- compile the C# code for the Linux platform
- install the .NET runtime for Linux into the Databricks Runtime using an init script
- copy the exe from storage to the cluster using an init script
- execute the exe using %sh, Python's subprocess module, or a similar approach in the notebook
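For the last step of Option-2, a minimal sketch of invoking the recompiled binary from a notebook cell with `subprocess`. The path is a placeholder; in practice the init script would have copied the binary there, and the CLI arguments depend entirely on how the exe is written:

```python
import subprocess

# Placeholder path; an init script would copy the Linux build of the exe here.
EXE_PATH = "/local_disk0/businesslogic"  # assumption

def run_business_logic(exe_path, args):
    """Run the executable on the driver and return its stdout.

    Raises CalledProcessError if the exe exits with a non-zero status.
    """
    result = subprocess.run(
        [exe_path, *args],
        capture_output=True,
        text=True,
        check=True,    # surface failures instead of silently returning
        timeout=600,
    )
    return result.stdout

# Hypothetical usage:
# output = run_business_logic(EXE_PATH, ["--input", "/dbfs/tmp/in.csv"])
```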
Both are anti-patterns. Option-1 is less invasive and will take less effort than Option-2.
Once you review and understand the internals of the exe, consider rewriting/porting it to Spark APIs for better maintainability and performance.
3 hours ago
Hi @nkrish
Databricks notebooks cannot run a Windows-based C# executable. Databricks compute runs on Linux and does not support executing native Windows .exe files, so a C# exe cannot be called directly from a Databricks notebook. The supported pattern is to run the exe outside Databricks and integrate through a service or API, or to refactor the logic so it can run cross-platform or be reimplemented in Spark or Python inside Databricks.