Accessing Kerberos-protected WebHDFS from a .Net application (console)

Submitted by 你。 on 2019-12-22 09:56:42

Question


I'm unable to access WebHDFS from a browser because the cluster is secured with Kerberos. Can anyone help me with this?

Below is the error shown in the browser for “http://****.****/webhdfs/v1/prod/snapshot_rpx/archive?op=LISTSTATUS&user.name=us”:

HTTP ERROR 401

Problem accessing /webhdfs/v1/prod/snapshot_rpx/archive. Reason: Authentication required

.Net code for making a request to this URL:

// Requires: using System.IO; using System.Net;
HttpWebRequest http = (HttpWebRequest)WebRequest.Create(requestUri);
http.Timeout = timeout;
http.ContentType = contentType;

string responseData = string.Empty;
// Dispose the response, its stream and the reader once the body is read
using (WebResponse response = http.GetResponse())
using (Stream stream = response.GetResponseStream())
using (StreamReader sr = new StreamReader(stream))
{
    responseData = sr.ReadToEnd();
}

return responseData;

Answer 1:


[Important notice] This answer applies to a plain Hadoop cluster using a Linux KDC (typically MIT Kerberos). For a Cloudera cluster relying on a Microsoft Active Directory KDC, any .Net HTTP connector can achieve SPNEGO through Microsoft SSPI (sooo boring...)

~~~~

The only way I know to access WebHDFS from the Microsoft world is an ugly and complex workaround:

  • install the MIT Kerberos for Windows utility on the machine that will actually connect to HDFS, plus the appropriate Kerberos 5 config file
  • make sure that your JVM has the "unlimited strength cryptography" security policy installed (separate download, duh)
  • develop a small Java utility that connects to the WebHDFS service (on the NameNode) using SPNEGO with a GSSAPI Kerberos ticket, obtained in one of two ways:

      Option 1: create the ticket through the GUI, and tell Java to fetch it from the default cache

      Option 2: tell Java to create its own ticket automatically, using a keytab file (must be created on Linux with ktutil; no such utility in the Windows package), and ignore the cache

  • make your Java code run a single GET to retrieve an HDFS delegation token for this session, dump the token to StdOut, then exit
  • make your .Net code run the Java utility, capture StdOut, and retrieve the token
  • connect to WebHDFS (NameNode + any subsequent redirects to the DataNodes) without SPNEGO, passing the token in the URL as proof of pre-authentication
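The Java side of the steps above could look roughly like the sketch below. It is not a tested recipe: it assumes the JDK's built-in HTTP Negotiate support can find a usable Kerberos ticket (krb5 config, ticket cache or JAAS keytab login all set up out of band), and the class name, helper names and command-line arguments are made up for illustration. The `GETDELEGATIONTOKEN` operation, the `urlString` field in its JSON reply and the `delegation` query parameter are part of the WebHDFS REST API.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UnsupportedEncodingException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;
import java.util.Scanner;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: fetch one delegation token, print it, exit.
class FetchDelegationToken {

    // WebHDFS replies with {"Token":{"urlString":"<token>"}}
    private static final Pattern URL_STRING =
            Pattern.compile("\"urlString\"\\s*:\\s*\"([^\"]+)\"");

    static String enc(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    // URL of the single SPNEGO-authenticated request.
    static String tokenRequestUrl(String nameNodeBase, String renewer) {
        return nameNodeBase + "/webhdfs/v1/?op=GETDELEGATIONTOKEN&renewer=" + enc(renewer);
    }

    // A regex keeps the sketch dependency-free; a JSON library would be more robust.
    static String extractToken(String json) {
        Matcher m = URL_STRING.matcher(json);
        if (!m.find()) {
            throw new IllegalArgumentException("no urlString in response: " + json);
        }
        return m.group(1);
    }

    // Shape of the URLs the .Net client would call afterwards, without SPNEGO.
    static String withDelegation(String nameNodeBase, String path, String op, String token) {
        return nameNodeBase + "/webhdfs/v1" + path + "?op=" + op + "&delegation=" + enc(token);
    }

    public static void main(String[] args) throws IOException {
        // args[0] = NameNode base URL, args[1] = renewer (both supplied by the caller)
        // Assumption: lets GSSAPI use credentials from the ticket cache or a JAAS login
        System.setProperty("javax.security.auth.useSubjectCredsOnly", "false");
        URL url = new URL(tokenRequestUrl(args[0], args[1]));
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        try (InputStream in = conn.getInputStream();
             Scanner body = new Scanner(in, "UTF-8").useDelimiter("\\A")) {
            // Only the token goes to StdOut, so the caller can capture it verbatim
            System.out.println(extractToken(body.hasNext() ? body.next() : ""));
        }
    }
}
```

The .Net program would then start this utility as a child process, read the single line it prints, and append `&delegation=<token>` to every WebHDFS URL it requests, including the DataNode URLs it is redirected to.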

So in the end it's a Java problem. And setting up a working Kerberos config is incredibly tricky (cf. "Madness beyond the Gate", the current reference site about Kerberos implementation issues in the Hadoop ecosystem).




Answer 2:


Sorry for the delayed response. Apache Knox may actually provide the solution that you are looking for. It shields REST clients from the details of how the Hadoop cluster itself is secured. The cluster can go from secured to unsecured on a whim and the clients will still authenticate to the Knox Gateway the same way.

The question is how exactly you would like to authenticate to Knox. The typical way is HTTP Basic Auth against LDAP (which could be AD). There are, however, other authentication/federation providers that allow for other mechanisms as well.
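As a rough illustration of the Basic Auth route, the sketch below lists a directory through Knox. It assumes a gateway whose TLS certificate the JVM already trusts and a topology that uses the default Basic Auth provider; the class name, helper names and command-line arguments are hypothetical. The `{gateway}/{topology}/webhdfs/v1` URL layout follows the Knox user guide.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.Scanner;

// Hypothetical client: call WebHDFS through a Knox gateway with HTTP Basic Auth.
class KnoxWebHdfsClient {

    // Knox exposes WebHDFS under {gatewayBase}/{topology}/webhdfs/v1
    static String knoxUrl(String gatewayBase, String topology, String path, String op) {
        return gatewayBase + "/" + topology + "/webhdfs/v1" + path + "?op=" + op;
    }

    // Standard Basic scheme: "Basic " + base64(user:password)
    static String basicAuthHeader(String user, String password) {
        byte[] raw = (user + ":" + password).getBytes(StandardCharsets.UTF_8);
        return "Basic " + Base64.getEncoder().encodeToString(raw);
    }

    public static void main(String[] args) throws IOException {
        // args: gateway base URL, topology name, user, password, HDFS path
        URL url = new URL(knoxUrl(args[0], args[1], args[4], "LISTSTATUS"));
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestProperty("Authorization", basicAuthHeader(args[2], args[3]));
        try (InputStream in = conn.getInputStream();
             Scanner body = new Scanner(in, "UTF-8").useDelimiter("\\A")) {
            // Knox relays the WebHDFS JSON response unchanged
            System.out.println(body.hasNext() ? body.next() : "");
        }
    }
}
```

The client never touches Kerberos here: Knox authenticates the user against LDAP and talks to the secured cluster on the client's behalf, so the same request shape works from .Net with an `Authorization` header.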

The header-based preauth SSO provider is a decent way to go for web-app-type use cases. See: http://knox.apache.org/books/knox-0-7-0/user-guide.html#Preauthenticated+SSO+Provider

Coupled with SSL mutual authentication (http://knox.apache.org/books/knox-0-7-0/user-guide.html#Mutual+Authentication+with+SSL) between the application and Apache Knox, this is an effective way to leverage Knox's role as a trusted proxy for Hadoop to federate the identity established in your application.

The upcoming v0.8.0 release introduces more SSO mechanisms as well.

Hadoop REST clients shouldn't need to know so much about the cluster that they all break whenever Hadoop's flexibility lets services move or security be enabled in a different way. Forcing SPNEGO on every browser is a showstopper for many. Apache Knox addresses these issues in a way that REST API developers and consumers are accustomed to working with.



Source: https://stackoverflow.com/questions/33878290/accessing-kerberos-protected-webhdfs-from-net-applicationconsole
