Hello, thank you for the reply. I've verified both of those points. The pipeline works if I use only regular Python functions, but it gives the error as soon as I use a custom Data Source or a UDF. My guess is that both of these run in a different Spark context than the one the pipeline sets up at the start, so my sys.path.append no longer takes effect?
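To show the kind of behaviour I suspect, here is a minimal sketch in plain Python, with no Spark involved. It is only an analogy: it assumes the UDF / Data Source code runs in a separate Python worker process rather than in the process where I called sys.path.append. The path and the helper name `child_sees_path` are made up for illustration.

```python
import subprocess
import sys

# Hypothetical directory, used only for illustration; it does not need to exist.
EXTRA_PATH = "/tmp/hypothetical_pipeline_libs"


def child_sees_path(path: str) -> bool:
    """Start a fresh Python interpreter and report whether `path` is on its sys.path."""
    code = "import sys; print(sys.argv[1] in sys.path)"
    result = subprocess.run(
        [sys.executable, "-c", code, path],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip() == "True"


# Modify sys.path in the current ("driver"-like) process only.
sys.path.append(EXTRA_PATH)

parent_has_path = EXTRA_PATH in sys.path       # checked in this process
child_has_path = child_sees_path(EXTRA_PATH)   # checked in a separate process

print("parent sees path:", parent_has_path)
print("child sees path:", child_has_path)
```

If the same thing applies to the pipeline, the append would only affect the process that ran it, and anything executed in another Python process would start with its own sys.path. I have not confirmed that this is what actually happens here.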