How to call a Java function from Python/NumPy?

陌清茗 2020-12-30 12:26

It is clear to me how to extend Python with C++, but what if I want to write a function in Java to be used with numpy?

Here is a simple scenario: I want to compute

3 answers
  • 2020-12-30 12:46

    I'm not sure about numpy support, but the following might be helpful:

    http://pypi.python.org/pypi/JCC/

  • 2020-12-30 12:50

    I consider Jython one of the best options: it makes using Java objects from Python seamless. I integrated Weka with my Python programs, and it was very easy. Just import the Weka classes and call them from the Python code as you would in Java.

    http://www.jython.org/

  • 2020-12-30 12:51

    I spent some time on my own question and would like to share my answer, as I feel there is not much information on this topic on Stack Overflow. I also think Java will become more relevant in scientific computing (see, e.g., the WEKA package for data mining) because of its improving performance and other good software-development features.


    In general, it turns out that, with the right tools, it is much easier to extend Python with Java than with C/C++!


    Overview and assessment of tools to call Java from Python

    • JCC (http://pypi.python.org/pypi/JCC): I could not find proper documentation, which made this tool unusable for me.

    • Py4J: requires starting the Java process before using Python. As others have remarked, this is a possible point of failure. Moreover, few usage examples are documented.

    • JPype: although development seems to be dead, it works well and there are many examples of it on the web (e.g. see http://kogs-www.informatik.uni-hamburg.de/~meine/weka-python/ for using data mining libraries written in Java). Therefore I decided to focus on this tool.

    Installing JPype on Fedora 16

    I am using Fedora 16. Since there are some issues when installing JPype on Linux, I describe my approach. Download JPype, then modify the setup.py script by setting the JDK path on line 48:

    self.javaHome = '/usr/java/default'
    

    then run:

    sudo python setup.py install
    

    After a successful installation, check this file:

    /usr/lib64/python2.7/site-packages/jpype/_linux.py

    and remove the method getDefaultJVMPath() or rename it to getDefaultJVMPath_old(), then add the following method:

    def getDefaultJVMPath():
        return "/usr/java/default/jre/lib/amd64/server/libjvm.so"
    

    Alternative approach: do not change the file _linux.py, but never use the method getDefaultJVMPath() (or methods which call it). Wherever you would call getDefaultJVMPath(), pass the path to the JVM directly. Note that there can be several such paths; for example, on my system I also have the following, referring to different versions of the JVM (it is not clear to me whether the client or server JVM is better suited):

    • /usr/lib/jvm/java-1.5.0-gcj-1.5.0.0/jre/lib/x86_64/client/libjvm.so
    • /usr/lib/jvm/java-1.5.0-gcj-1.5.0.0/jre/lib/x86_64/server/libjvm.so
    • /usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/jre/lib/amd64/server/libjvm.so
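
    To see which JVM libraries are available before hard-coding one, the candidates can be listed with a few lines of Python. This is only a minimal sketch: the helper name find_libjvm is made up for illustration, and it assumes the library file is called libjvm.so, as in the paths above.

```python
import os


def find_libjvm(java_home, lib_name="libjvm.so"):
    """Return the sorted paths of every file named lib_name below java_home.

    find_libjvm is a hypothetical helper, not part of JPype.
    """
    matches = []
    # os.walk silently yields nothing if java_home does not exist
    for root, _dirs, files in os.walk(java_home):
        if lib_name in files:
            matches.append(os.path.join(root, lib_name))
    return sorted(matches)


if __name__ == "__main__":
    # /usr/java/default is the JDK location used in this answer;
    # replace it with the JDK directory on your own system
    for path in find_libjvm("/usr/java/default"):
        print(path)
```

    Any of the printed paths can then be passed to jpype.startJVM() in place of the value returned by getDefaultJVMPath().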

    Finally, add the following line to ~/.bashrc (or run it each time before opening a python interpreter):

    export JAVA_HOME='/usr/java/default'
    

    (The above directory is in reality just a symbolic link to my latest version of the JDK, which is located at /usr/java/jdk1.7.0_04.)

    Note that all the tests in the directory where JPype was downloaded, i.e. JPype-0.5.4.2/test/testsuite.py, will fail (so you can ignore them).

    To see if it works, run this script in Python:

    import jpype 
    jvmPath = jpype.getDefaultJVMPath() 
    jpype.startJVM(jvmPath)
    # print a random text using a Java class
    jpype.java.lang.System.out.println ('Berlusconi likes women') 
    jpype.shutdownJVM() 
    

    Calling Java classes from Python, also using NumPy

    Let's start by implementing a Java class containing some functions which I want to apply to numpy arrays. Since there is no concept of state, I use static functions so that I do not need to create any Java object (creating Java objects would not change anything).

    /**
     * Cookbook to pass numpy arrays to Java via Jpype
     * @author Mannaggia
     */
    
    package test.java;
    
    public class Average2 {
    
    public static double compute_average(double[] the_array){
        // compute the average
        double result=0;
        int i;
        for (i=0;i<the_array.length;i++){
            result=result+the_array[i];
        }
        return result/the_array.length;
    }
    // multiplies array by a scalar
    public static double[] multiply(double[] the_array, double factor) {
    
        int i;
        double[] the_result= new double[the_array.length];
        for (i=0;i<the_array.length;i++) {
            the_result[i]=the_array[i]*factor;
        }
        return the_result;
    }
    
    /**
     * Matrix multiplication. 
     */
    public static double[][] mult_mat(double[][] mat1, double[][] mat2){
        // find sizes
        int n1=mat1.length;
        int n2=mat2.length;
        int m1=mat1[0].length;
        int m2=mat2[0].length;
        // check that we can multiply
        if (n2 !=m1) {
            //System.err.println("Error: The number of columns of the first argument must equal the number of rows of the second");
            //return null;
            throw new IllegalArgumentException("Error: The number of columns of the first argument must equal the number of rows of the second");
        }
        // if we can, then multiply
        double[][] the_results=new double[n1][m2];
        int i,j,k;
        for (i=0;i<n1;i++){
            for (j=0;j<m2;j++){
                // initialize
                the_results[i][j]=0;
                for (k=0;k<m1;k++) {
                    the_results[i][j]=the_results[i][j]+mat1[i][k]*mat2[k][j];
                }
            }
        }
        return the_results;
    }
    
    /**
     * @param args
     */
    public static void main(String[] args) {
        // test case
        double an_array[]={1.0, 2.0,3.0,4.0};
        double res=Average2.compute_average(an_array);
        System.out.println("Average is =" + res);
    }
    }
    

    The name of the class is a bit misleading: besides computing the average of a numpy vector (method compute_average), it also multiplies a numpy vector by a scalar (method multiply) and performs matrix multiplication (method mult_mat).
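
    To know which values to expect back from Java, the three methods can be mirrored in plain Python. This is only a reference sketch for cross-checking results, not part of the JPype workflow; the function names simply copy the Java ones.

```python
def compute_average(values):
    """Arithmetic mean, as in Average2.compute_average.

    Unlike the Java version (which returns NaN for an empty array),
    this raises ZeroDivisionError on empty input.
    """
    return sum(values) / float(len(values))


def multiply(values, factor):
    """Multiply every element by a scalar, as in Average2.multiply."""
    return [v * factor for v in values]


def mult_mat(mat1, mat2):
    """Matrix product of two nested lists, as in Average2.mult_mat."""
    n1, m1 = len(mat1), len(mat1[0])
    n2, m2 = len(mat2), len(mat2[0])
    if n2 != m1:
        # the Java method throws IllegalArgumentException here
        raise ValueError("The number of columns of the first argument "
                         "must equal the number of rows of the second")
    return [[sum(mat1[i][k] * mat2[k][j] for k in range(m1))
             for j in range(m2)]
            for i in range(n1)]


if __name__ == "__main__":
    mat1 = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
    ones = [[1.0], [1.0], [1.0]]
    print(mult_mat(mat1, ones))  # prints [[6.0], [15.0], [24.0]]
```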

    After compiling the above Java class, we can run the following Python script:

    import numpy as np
    import jpype
    
    jvmPath = jpype.getDefaultJVMPath() 
    # we need to specify the classpath used by the JVM
    classpath='/home/mannaggia/workspace/TestJava/bin'
    jpype.startJVM(jvmPath,'-Djava.class.path=%s' % classpath)
    
    # numpy array
    the_array=np.array([1.1, 2.3, 4, 6,7])
    # build a JArray; note that we need to specify the Java double type using the jpype.JDouble wrapper
    the_jarray2=jpype.JArray(jpype.JDouble, the_array.ndim)(the_array.tolist())
    # look up the Java class by its fully qualified name (package test.java)
    Class_average2=jpype.JClass('test.java.Average2')
    res2=Class_average2.compute_average(the_jarray2)
    np.abs(np.average(the_array)-res2) # ok perfect match! 
    
    # now try to multiply an array
    res3=Class_average2.multiply(the_jarray2,jpype.JDouble(3))
    # convert to numpy array
    res4=np.array(res3) #ok
    
    # matrix multiplication
    the_mat1=np.array([[1,2,3], [4,5,6], [7,8,9]],dtype=float)
    #the_mat2=np.array([[1,0,0], [0,1,0], [0,0,1]],dtype=float)
    the_mat2=np.array([[1], [1], [1]],dtype=float)
    the_mat3=np.array([[1, 2, 3]],dtype=float)
    
    the_jmat1=jpype.JArray(jpype.JDouble, the_mat1.ndim)(the_mat1.tolist())
    the_jmat2=jpype.JArray(jpype.JDouble, the_mat2.ndim)(the_mat2.tolist())
    res5=Class_average2.mult_mat(the_jmat1,the_jmat2)
    res6=np.array(res5) #ok
    
    # other test
    the_jmat3=jpype.JArray(jpype.JDouble, the_mat3.ndim)(the_mat3.tolist())
    res7=Class_average2.mult_mat(the_jmat3,the_jmat2)
    res8=np.array(res7)
    res9=Class_average2.mult_mat(the_jmat2,the_jmat3)
    res10=np.array(res9)
    
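    # test error due to invalid matrix multiplication: mult_mat throws
    # IllegalArgumentException, which JPype raises as a Python exception
    the_mat4=np.array([[1], [2]],dtype=float)
    the_jmat4=jpype.JArray(jpype.JDouble, the_mat4.ndim)(the_mat4.tolist())
    try:
        res11=Class_average2.mult_mat(the_jmat1,the_jmat4)
    except Exception as exc:
        print(exc)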
    
    jpype.java.lang.System.out.println ('Goodbye!') 
    jpype.shutdownJVM() 
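
    If the JVM cannot find the class, the classpath is the first thing to check. Because Average2 is declared in the package test.java, the classpath must point at the directory that contains test/java/Average2.class, not at the class file itself. The sketch below builds the option string and checks that location; both helper names are made up for illustration and are not part of JPype.

```python
import os


def classpath_option(entries):
    """Build the -Djava.class.path option from directories and/or JAR files.

    Entries are joined with os.pathsep (':' on Linux, ';' on Windows).
    """
    return "-Djava.class.path=" + os.pathsep.join(entries)


def class_file_path(classpath_dir, qualified_name):
    """Return where the .class file of qualified_name should be located."""
    parts = qualified_name.split(".")
    return os.path.join(classpath_dir, *parts) + ".class"


if __name__ == "__main__":
    # classpath and class name as used in the script above
    classpath = "/home/mannaggia/workspace/TestJava/bin"
    expected = class_file_path(classpath, "test.java.Average2")
    print(classpath_option([classpath]))
    print("%s exists: %s" % (expected, os.path.isfile(expected)))
```

    The string returned by classpath_option() can be passed as the second argument to jpype.startJVM(), exactly like the hand-written option in the script.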
    