
Getting Py4J "Could not find py4j jar" error when trying to use pypmml, solution doesn't work

mattsteinpreis
New Contributor III

I'm trying to use pypmml in a Databricks notebook, but I'm getting the known `Py4JError: Could not find py4j jar at` error. I've followed the solution here: https://kb.databricks.com/libraries/pypmml-fail-find-py4j-jar.html. However, it has not worked for me.

Details:

  • When I run `%pip install py4j==0.10.9` followed by `%sh find /databricks/ -name "py4j*jar"`, no results are found. However, if I install to the cluster via the Compute UI, then I do find the jar in the expected path.
  • I move the jar via:
dbutils.fs.cp('/databricks/python3/share/py4j/py4j0.10.9.jar', '/py4j/')

  • I create the init script, like so:
dbutils.fs.put("/<my-path>/install-py4j-jar.sh", """
 
#!/bin/bash
mkdir -p /share/py4j/ /current-release/
cp /dbfs/py4j/py4j0.10.9.jar /share/py4j/
cp /dbfs/py4j/py4j0.10.9.jar /current-release
""", True)
  • I attach the init script to the cluster and restart it.
  • I install pypmml and run something like:
from pypmml import Model
model = Model.load('/dbfs/<my-path>/<my-model>.pmml')

  • I've tried installing pypmml using %pip as well as in the cluster UI.
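
One way to rule out a silent init-script failure is to check, after the restart, that the jar actually landed where the script was meant to copy it. The following is a minimal sketch, not part of the KB article: the two paths are taken from the init script above, while the helper name `missing_files` is made up for illustration.

```python
from pathlib import Path

# Target locations taken from the init script in the post.
EXPECTED_JARS = [
    "/share/py4j/py4j0.10.9.jar",
    "/current-release/py4j0.10.9.jar",
]


def missing_files(paths):
    """Return the subset of paths that do not exist as regular files."""
    return [str(p) for p in paths if not Path(p).is_file()]


if __name__ == "__main__":
    missing = missing_files(EXPECTED_JARS)
    print("missing:", missing if missing else "none")
```

If either path is reported as missing, the init script probably did not run or the source jar was not at `/dbfs/py4j/`. If both exist and the error persists, the jar location itself is unlikely to be the problem.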

No matter what, I always get the same error: Py4JError: Could not find py4j jar at

I'm using DBR 10.4 LTS ML, though I've tried other runtime versions to no avail.

Any ideas?

4 REPLIES

Hubert-Dudek
Esteemed Contributor III

To avoid a conflict with the preinstalled version, py4j needs to be installed via %pip install py4j==0.10.9.

You can check where it is installed this way:

%sh
pip install py4j==0.10.9
pip show py4j

This didn't fix the problem.

When I do this, pypmml does "see" this version, as when I later install pypmml, it skips the py4j requirement install and cites the py4j location from the show command. However, I still get the same error.

I don't know how to make pypmml know where to look to find the right py4j jar.

Also, even when I install py4j in this way, the databricks environment still seems to point to a different py4j install. If I run:

import py4j
print(py4j.__version__)
print(py4j.__file__)

I get a different version and path than what was specified/returned from the install commands.
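
To compare what the notebook's interpreter actually imports with what is on disk, the check above can be extended into a small diagnostic. This is only a sketch using the standard library: the helper names `find_py4j_jars` and `describe_py4j` are hypothetical, and the search root is an assumption that may need adjusting for a given cluster.

```python
import importlib.util
import sys
from pathlib import Path


def find_py4j_jars(roots):
    """Return every py4j*.jar found under the given root directories."""
    jars = []
    for root in roots:
        root = Path(root)
        if root.is_dir():  # silently skip roots that do not exist
            jars.extend(sorted(root.rglob("py4j*.jar")))
    return jars


def describe_py4j():
    """Return the file the importable py4j package loads from, or None."""
    spec = importlib.util.find_spec("py4j")
    return None if spec is None else spec.origin


if __name__ == "__main__":
    print("python prefix:", sys.prefix)
    print("py4j module  :", describe_py4j())
    # Assumed search root; add others (for example /databricks) as needed.
    for jar in find_py4j_jars([Path(sys.prefix) / "share"]):
        print("jar:", jar)
```

If the module path and the jar paths belong to different environments or different py4j versions, that mismatch would be consistent with the behaviour described here.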

Hi @Matthew Steinpreis,

Just a friendly follow-up. Are you still looking for help? Please let us know.

pawelmitrus
New Contributor III

I've been struggling with this myself, but after installing pypmml for Spark, I can use the other library. Maybe it will work for you:

Both PySpark and Scala work.
