Calling exe from notebook

nkrish
Visitor

How can I call an exe (built from C# code) from a Databricks notebook?

#csharp exe

 

siva-anantha
Contributor

@nkrish: Databricks clusters typically run Linux, so a Windows-only executable (.exe) will not run there. 

Even if you port the executable to Linux, it will run only on the driver node, so you would lose Spark's distributed computing.

Could you please provide the following details?
1) purpose of .exe - what it does?
2) whether you need Spark parallelism
3) your Databricks cloud and runtime version

nkrish
Visitor

@siva-anantha, we are working on a migration project in which some core business logic lives in C# code compiled as an exe. We have two options: rewrite the code in PySpark or use the exe as is. Since our timeline is very limited, we are thinking of calling the exe as is from a notebook. Please find my responses inline below.

1) purpose of .exe - what it does?
Some core business logic, such as calculations, filtering, and brand scanning, all happens in the exe.

2) whether you need Spark parallelism
Not sure yet, as we have not gone through the code fully.

3) your Databricks cloud and runtime version
Azure Databricks, with Databricks Runtime 17.x.

If parallelism is not needed, is it possible to execute the exe? If so, how?

@nkrish: Thank you for the response

If you don't need Spark parallelism, can you review these options?

Option-1: Expose the exe as an API on a separate Windows VM or container
       - Call the API from Databricks. The call is made from the driver.

Option-2: Recompile the exe for Linux
        - Compile the C# code for Linux
        - Install the .NET runtime for Linux on the cluster using an init script
        - Copy the executable from storage to the cluster using an init script
        - Run the executable from the notebook using %sh, Python subprocess, or a similar approach
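
As a rough sketch of the last step in Option-2: assuming the recompiled program is already on the driver and the .NET runtime is installed, a Python cell could run it along these lines. The helper name run_tool, the file paths, and the command-line arguments are hypothetical and only for illustration.

```python
import subprocess


def run_tool(cmd, timeout_s=600):
    """Run an external command on the driver and return its stdout as text.

    cmd is a list such as ["dotnet", "<path-to-dll>", "<args>"].
    Raises RuntimeError if the process exits with a non-zero code.
    """
    result = subprocess.run(
        cmd,
        capture_output=True,  # collect stdout and stderr instead of printing them
        text=True,            # decode bytes to str
        timeout=timeout_s,    # avoid blocking the notebook forever
    )
    if result.returncode != 0:
        raise RuntimeError(
            f"command failed with exit code {result.returncode}: "
            f"{result.stderr.strip()}"
        )
    return result.stdout


# Hypothetical invocation (paths and flags are assumptions, not from the thread):
# output = run_tool(["dotnet", "/tmp/tools/BusinessLogic.dll", "--input", "/tmp/in.csv"])
# print(output)
```

Note that this runs as a single process on the driver only, so it gives no Spark parallelism.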

Both are anti-patterns. Option-1 is less invasive and will take less effort than Option-2.
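
For Option-1, the notebook side could look something like the sketch below, assuming the exe has been wrapped in an HTTP service that accepts and returns JSON. The /run endpoint, the payload shape, the bearer-token header, and the function names are all assumptions; the real service would define its own contract.

```python
import json
import urllib.request


def build_request(base_url, payload, token=None):
    """Build a JSON POST request for the (hypothetical) /run endpoint."""
    headers = {"Content-Type": "application/json"}
    if token:
        # Assumes the service uses bearer-token auth; adjust to the real scheme.
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/run",
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


def call_exe_service(base_url, payload, token=None, timeout_s=60):
    """Send the payload to the service and return the decoded JSON response."""
    req = build_request(base_url, payload, token)
    with urllib.request.urlopen(req, timeout=timeout_s) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Hypothetical usage from a notebook cell (URL and payload are placeholders):
# result = call_exe_service("https://my-exe-service.example.com", {"brand": "abc"})
```

The request is sent from the driver, so each call is a single blocking round trip to the Windows host.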

Once you have reviewed and understood the internals of the exe, consider rewriting or porting the logic to Spark APIs for better maintainability and performance.

mukul1409
New Contributor

Hi @nkrish 
Databricks notebooks cannot run a Windows-based C# executable. Databricks compute runs on Linux and does not support executing native Windows exe files, so a C# exe cannot be called directly from a Databricks notebook. The supported pattern is to run the exe outside Databricks and integrate through a service or API, or to refactor the logic so that it can run cross-platform or be reimplemented in Spark or Python inside Databricks.

Mukul Chauhan