Thank you for the support. Yes, I was able to find a working solution.
- I placed the files into the Databricks distributed file system, DBFS. For others, this can be done manually with the Databricks CLI or through an init script; in this case I found the CLI easier (see the sketch after this list).
- I appended the directory containing the binary to the OS PATH environment variable. I'm using Python, so for me it looks like this:
import os
os.environ['PATH'] += ':/dbfs/FileStore/salmon/bin/'
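For reference, the upload step with the Databricks CLI looks roughly like this (a sketch; the local directory and the DBFS destination are placeholders for my setup):

databricks fs cp --recursive ./salmon/ dbfs:/FileStore/salmon/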
And that was it! I stored the paths to the input files in a DataFrame, then used Spark to iterate over its rows, calling a custom function that invokes the binary on each file.
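In case it helps anyone, here is a minimal sketch of that last step. The column name, file paths, and salmon arguments are all placeholders for my actual setup; note that I also set PATH inside the function, since on a multi-node cluster the driver-side environment change may not reach the executors:

import os
import subprocess
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# One input file per row; paths are placeholders
df = spark.createDataFrame(
    [('/dbfs/FileStore/data/sample1.fq',), ('/dbfs/FileStore/data/sample2.fq',)],
    ['input_path'],
)

def call_binary(row):
    # Make the binary visible to the worker processing this row
    os.environ['PATH'] += ':/dbfs/FileStore/salmon/bin/'
    # Illustrative salmon invocation; the index and output paths are placeholders,
    # and check=True raises an error if the binary exits non-zero
    subprocess.run(
        ['salmon', 'quant',
         '-i', '/dbfs/FileStore/salmon/index',
         '-l', 'A',
         '-r', row.input_path,
         '-o', '/dbfs/FileStore/salmon/out'],
        check=True,
    )

# Run the function once per row across the cluster
df.foreach(call_binary)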