Bradley
New Contributor III

Thank you for the support. Yes, I was able to find a working solution.

  1. I placed the files into DBFS, the Databricks distributed file system. For others, this can be done manually with the Databricks CLI or with init scripts. In my case I found the Databricks CLI easier.
  2. I appended the directory containing the binary to the OS PATH environment variable. I'm using Python, so for me it looks like this:
os.environ['PATH'] += ':/dbfs/FileStore/salmon/bin/'
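
For anyone following along, here is a minimal sketch of step 2 that avoids appending the same directory twice when a notebook cell is re-run. The helper name `add_to_path` and the binary name `salmon` are assumptions for illustration; only the directory comes from the line above.

```python
import os
import shutil


def add_to_path(directory: str) -> None:
    """Append `directory` to PATH for the current process, skipping duplicates."""
    entries = [e for e in os.environ.get("PATH", "").split(os.pathsep) if e]
    normalized = directory.rstrip("/")
    # Compare without trailing slashes so "/x/bin" and "/x/bin/" count as the same entry.
    if normalized not in [e.rstrip("/") for e in entries]:
        os.environ["PATH"] = os.pathsep.join(entries + [directory])


# On the cluster this would look something like:
# add_to_path("/dbfs/FileStore/salmon/bin/")
# print(shutil.which("salmon"))  # None means the binary was not found or is not executable
```

Note that this only changes PATH for the Python process that runs it. If the binary is called from code running on worker nodes, it may be necessary to set PATH inside the function that makes the call.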

And that was it! I stored the file paths to the input data in a DataFrame, then used Spark to iterate across its rows, calling a custom function that runs the binary on each one.
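
A rough sketch of what that last step could look like, not the exact code from my notebook. The function names, the `path` column name, and the way arguments are passed to the executable are all hypothetical, and it assumes a cluster where the RDD API is available.

```python
import os
import subprocess

BIN_DIR = "/dbfs/FileStore/salmon/bin/"  # directory from step 2


def run_on_file(input_path, executable, extra_args=()):
    """Run `executable [extra_args...] input_path`; return (exit code, stdout)."""
    # Extend PATH inside the function so the lookup also works in the
    # process that actually executes the call.
    env = dict(os.environ)
    env["PATH"] = env.get("PATH", "") + os.pathsep + BIN_DIR
    result = subprocess.run(
        [executable, *extra_args, input_path],
        env=env,
        capture_output=True,
        text=True,
    )
    return result.returncode, result.stdout


def run_over_dataframe(df, executable, path_col="path"):
    """Call run_on_file once per row of a Spark DataFrame; return (path, exit code) pairs."""
    paths = df.select(path_col).rdd.map(lambda row: row[0])
    return paths.map(lambda p: (p, run_on_file(p, executable)[0])).collect()
```

Collecting only the exit codes keeps the result small on the driver; any real output files would be written by the binary itself to wherever its own arguments point.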