Question
I'm trying to run a Hive script on AWS EMR using the PHP SDK. How can I pass parameters to the script (such as the input, the output, and the dates to work on)?
Thanks
Answer 1:
If you are struggling with this as well...
Sample code for passing variables to a Hive script can be found in the following Amazon forum thread.
Answer 2:
I've done this with the Java SDK. With the PHP SDK, essentially what you need to do is pass in the parameters you want with the add_job_flow_steps function.
You need to add the parameters to the StepConfig (for the script you are running) in the "Args" array when calling the function.
Args - string|array - Optional - A list of command line arguments passed to the JAR file’s main function when executed. Pass a string for a single value, or an indexed array for multiple values.
The format of the arguments is a bit confusing: you need an array of the form
("-d","yourVariable=itsValue","-d","anotherVariable=AnotherValue")
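If the variables already live in an associative array, the flat list above can be generated rather than typed by hand. The following is only a minimal sketch: build_hive_args is a hypothetical helper, not part of the AWS SDK, and the variable names and values are placeholders.

```php
<?php
// Hypothetical helper (not part of the AWS SDK): turns an associative array
// of Hive variables into the flat "-d", "name=value" list described above.
function build_hive_args(array $variables)
{
    $args = array();
    foreach ($variables as $name => $value) {
        $args[] = '-d';                  // one "-d" flag per variable
        $args[] = $name . '=' . $value;  // followed by its name=value pair
    }
    return $args;
}

// Placeholder names and values, matching the example in the text.
$args = build_hive_args(array(
    'yourVariable'    => 'itsValue',
    'anotherVariable' => 'AnotherValue',
));
print_r($args);
```

The resulting array could then be used as the 'Args' value of the step, assuming the flat format is what the PHP SDK expects.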
So it should end up looking a bit like this:
// $emr is assumed to be an AmazonEMR client instance
$emr->add_job_flow_steps('j-19430859jg9', array(
    new CFStepConfig(array(
        'Name' => 'Run a hive script',
        'HadoopJarStep' => array(
            'Jar' => CFHadoopStep::run_hive_script(),
            'Args' => array("-d", "yourVariable=itsValue", "-d", "anotherVariable=AnotherValue")
        )
    ))
));
I don't know if the syntax is quite right; I haven't tried it.
At least this is how it works in Java. For PHP you may need an associative array instead, so I would try a variety of formats.
I expect the "-d" prefix is there so that these parameters are not confused with other Hadoop/Hive configuration parameters.
You can then access these variables in the script much as you would in bash, using ${yourVariable}.
SELECT * FROM TABLE WHERE column='${yourVariable}';
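To check quoting before submitting a step, it can help to preview locally what the query should look like once the variables are filled in. This is only a rough sketch: preview_hive_query is a hypothetical helper that does a plain text replacement, whereas the real substitution is done by Hive on the cluster and may differ in edge cases.

```php
<?php
// Hypothetical local preview only: Hive performs the real substitution.
function preview_hive_query($query, array $variables)
{
    $search = array();
    $replace = array();
    foreach ($variables as $name => $value) {
        $search[] = '${' . $name . '}';  // placeholder as written in the script
        $replace[] = $value;             // value that would be passed via -d
    }
    return str_replace($search, $replace, $query);
}

// "\$" keeps PHP from interpolating the placeholder inside double quotes.
echo preview_hive_query(
    "SELECT * FROM TABLE WHERE column='\${yourVariable}';",
    array('yourVariable' => 'itsValue')
), "\n";
```

With the placeholder values above, this prints SELECT * FROM TABLE WHERE column='itsValue';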
Source: https://stackoverflow.com/questions/9892620/pass-parameters-to-hive-script-using-aws-php-sdk