I have a working application for managing HDFS using WebHDFS. I need to be able to do this on a Kerberos secured cluster.
The problem is that there is no library or
Using Java code plus the Hadoop Java API to open a Kerberized session, get the Delegation Token for the session, and pass that Token to the other app -- as suggested by @tellisnz -- has a drawback: the Java API requires quite a lot of dependencies (i.e. a lot of JARs, plus the Hadoop native libraries). If you run your app on Windows, in particular, it will be a tough ride.
Another option is to use Java code plus WebHDFS to run a single SPNEGO-authenticated query and GET the Delegation Token, then pass it to the other app -- that option requires absolutely no Hadoop library on your server. The bare-bones version would be something like:
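import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// HttpURLConnection performs the SPNEGO handshake itself, provided the JVM has Kerberos credentials
URL urlGetToken = new URL("http://<host>:<port>/webhdfs/v1/?op=GETDELEGATIONTOKEN");
HttpURLConnection cnxGetToken = (HttpURLConnection) urlGetToken.openConnection();
// The response is JSON; capture the value that follows "urlString"
Pattern regexHasToken = Pattern.compile("urlString[\": ]+(.[^\" ]+)");
try (BufferedReader httpMessage = new BufferedReader(new InputStreamReader(cnxGetToken.getInputStream()), 1024)) {
    String httpMessageLine;
    while ((httpMessageLine = httpMessage.readLine()) != null) {
        Matcher regexToken = regexHasToken.matcher(httpMessageLine);
        if (regexToken.find()) {
            System.out.println("Use that template: http://<Host>:<Port>/webhdfs/v1%AbsPath%?delegation=" + regexToken.group(1) + "&op=...");
        }
    }
}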
That's what I use to access HDFS from a Windows PowerShell script (or even an Excel macro). Caveat: on Windows you have to create your Kerberos TGT on the fly, by passing the JVM a JAAS config that points to the appropriate keytab file. But that caveat applies to the Java API as well.
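Once the token has been printed, the other app only has to substitute it into the URL template. Here is a minimal sketch of that substitution; the class name `WebHdfsUrls`, the method `withDelegation`, the host name and the port 50070 are illustrative assumptions, not something from the original code. The token is URL-encoded defensively, although the `urlString` value is normally URL-safe already.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper: fills in the "?delegation=<token>&op=..." template shown above.
class WebHdfsUrls {

    // URL-encodes a query parameter value as UTF-8.
    static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a compliant JVM
            throw new IllegalStateException(e);
        }
    }

    // Builds a WebHDFS URL that authenticates with a delegation token
    // instead of SPNEGO. absPath must start with "/".
    static String withDelegation(String host, int port, String absPath, String op, String token) {
        return "http://" + host + ":" + port + "/webhdfs/v1" + absPath
                + "?delegation=" + encode(token)
                + "&op=" + op;
    }

    public static void main(String[] args) {
        // Placeholder values; a real token comes from op=GETDELEGATIONTOKEN.
        System.out.println(withDelegation("namenode.example.com", 50070, "/user/alice", "LISTSTATUS", "abc-DEF_123"));
    }
}
```

Any HTTP client (PowerShell, an Excel macro, curl) can then call the resulting URL without doing Kerberos itself, for as long as the token remains valid.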
You could take a look at the hadoop-auth client and create a service which does the first connection; then you might be able to grab the 'Authorization' and 'X-Hadoop-Delegation-Token' headers and the cookie from it and add them to your basic client's requests.
First, you'll need to have either used kinit to authenticate your user before running the application, or done a JAAS login for your user; this tutorial provides a pretty good overview of how to do that.
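For the JAAS route, the login is usually described in a config file, but the same settings can be supplied programmatically. The sketch below is one way to do that with a keytab, assuming the JDK's bundled `Krb5LoginModule`; the class name `KeytabJaasConfig`, the principal and the keytab path are hypothetical placeholders.

```java
import java.util.HashMap;
import java.util.Map;
import javax.security.auth.login.AppConfigurationEntry;
import javax.security.auth.login.AppConfigurationEntry.LoginModuleControlFlag;
import javax.security.auth.login.Configuration;

// Hypothetical in-memory JAAS configuration that logs in from a keytab.
class KeytabJaasConfig extends Configuration {
    private final String principal;
    private final String keytabPath;

    KeytabJaasConfig(String principal, String keytabPath) {
        this.principal = principal;
        this.keytabPath = keytabPath;
    }

    // Returns the same single entry whatever application name is requested.
    @Override
    public AppConfigurationEntry[] getAppConfigurationEntry(String name) {
        Map<String, String> options = new HashMap<>();
        options.put("useKeyTab", "true");    // read the key from a keytab file
        options.put("keyTab", keytabPath);
        options.put("principal", principal);
        options.put("storeKey", "true");
        options.put("doNotPrompt", "true");  // never ask for a password interactively
        return new AppConfigurationEntry[] {
            new AppConfigurationEntry(
                "com.sun.security.auth.module.Krb5LoginModule",
                LoginModuleControlFlag.REQUIRED,
                options)
        };
    }

    public static void main(String[] args) {
        Configuration cfg = new KeytabJaasConfig("alice@EXAMPLE.COM", "/etc/security/alice.keytab");
        AppConfigurationEntry entry = cfg.getAppConfigurationEntry("any")[0];
        System.out.println(entry.getLoginModuleName() + " " + entry.getOptions());
        // A real login needs a reachable KDC and a valid keytab, e.g.:
        // new javax.security.auth.login.LoginContext("any", null, null, cfg).login();
    }
}
```

The sketch only builds and prints the configuration; the actual login call is left commented out because it cannot succeed without a real Kerberos environment.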
Then, to do the login to WebHDFS/HttpFS, we'll need to do something like:
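import java.net.HttpURLConnection;
import java.net.URL;
import org.apache.hadoop.security.authentication.client.AuthenticatedURL;

// AuthenticatedURL (from hadoop-auth) performs the Kerberos/SPNEGO handshake and fills in the token
URL url = new URL("http://yourhost:8080/your-kerberised-resource");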
AuthenticatedURL.Token token = new AuthenticatedURL.Token();
HttpURLConnection conn = new AuthenticatedURL().openConnection(url, token);
String authorizationTokenString = conn.getRequestProperty("Authorization");
String delegationToken = conn.getRequestProperty("X-Hadoop-Delegation-Token");
...
// do whatever you need to do to obtain your basic client connection
...
myBasicClientConnection.setRequestProperty("Authorization", authorizationTokenString);
myBasicClientConnection.setRequestProperty("Cookie", "hadoop.auth=" + token.toString());
myBasicClientConnection.setRequestProperty("X-Hadoop-Delegation-Token", delegationToken);