问题
I have a Spark Streaming application built with Maven (as jar) and deployed with the spark-submit
script. The application project layout follows the standard directory layout:
myApp
src
main
scala
com.mycompany.package
MyApp.scala
DoSomething.scala
...
resources
aPerlScript.pl
...
test
scala
com.mycompany.package
MyAppTest.scala
...
target
...
pom.xml
In the DoSomething.scala
object I have a method (let's call it doSomething()
) that tries to execute a Perl script -- aPerlScript.pl
(from the resources
folder) -- using scala.sys.process.Process
and passing two arguments to the script (the first one is the absolute path to a binary file used as input, the second one is the path/name of the produced output file). I call then DoSomething.doSomething()
.
The issue is that I was not able to access the script, not with absolute paths, relative paths, getClass.getClassLoader.getResource, getClass.getResource, I have specified the resources folder in my pom.xml
. None of my attempts succeeded. I don't know how to find the stuff I put in src/main/resources.
I will appreciate any help.
SIDE NOTES:
- I use an external Process instead of a Spark pipe because, at this step of my workflow, I must handle binary files as input and output.
- I'm using Spark-streaming 1.1.0, Scala 2.10.4 and Java 7. I build the jar with "Maven install" from within Eclipse (Kepler)
- When I use the
getClass.getClassLoader.getResource
"standard" method to access resources I find that the actual classpath is thespark-submit
script's one.
回答1:
There are a few solutions. The simplest is to use Scala's process infrastructure:
import scala.sys.process._
object RunScript {
val arg = "some argument"
val stream = RunScript.getClass.getClassLoader.getResourceAsStream("aPerlScript.pl")
val ret: Int = (s"/usr/bin/perl - $arg" #< stream).!
}
In this case, ret
is the return code for the process and any output from the process is directed to stdout
.
A second (longer) solution is to copy the file aPerlScript.pl
from the jar file to some temporary location and execute it from there. This code snippet should have most of what you need.
object RunScript {
// Set up copy destination from the Java temporary directory. This is /tmp on Linux
val destDir = System.getProperty("java.io.tmpdir") + "/"
// Get a stream to the script in the resources dir
val source = Channels.newChannel(RunScript.getClass.getClassLoader.getResourceAsStream("aPerlScript.pl"))
val fileOut = new File(destDir, "aPerlScript.pl")
val dest = new FileOutputStream(fileOut)
// Copy file to temporary directory
dest.getChannel.transferFrom(source, 0, Long.MaxValue)
source.close()
dest.close()
}
// Schedule the file for deletion for when the JVM quits
sys.addShutdownHook {
new File(destDir, "aPerlScript.pl").delete
}
// Now you can execute the script.
This approach allows you to bundle native libraries in JAR files. Copying them out allows the libraries to be loaded at runtime for whatever JNI mischief you have planned.
来源:https://stackoverflow.com/questions/26061096/how-to-access-static-resources-in-jar-that-correspond-to-src-main-resources-fol