For a project, I'm trying to get data from Goodreads.com that is only accessible when you're logged in. I'm new to Jsoup, since I'm using it only for this.
You can log in with this code:
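import org.jsoup.Connection;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class GoodreadsLogin {
    public static void main(String[] args) throws Exception {
        // Load the home page first to obtain the session cookie and the hidden form fields.
        Connection.Response execute = Jsoup
                .connect("https://www.goodreads.com/")
                .method(Connection.Method.GET)
                .execute();

        // Read the hidden values from the sign-in form.
        Element signIn = execute.parse().getElementById("sign_in");
        String authenticityToken = signIn.select("input[name=authenticity_token]").first().val();
        String n = signIn.select("input[name=n]").first().val();

        // Post the credentials together with the hidden values and the cookies from the first request.
        Document document = Jsoup.connect("https://www.goodreads.com/user/sign_in")
                .data("cookieexists", "✓")
                .data("authenticity_token", authenticityToken)
                .data("user[email]", "user@email.com")
                .data("user[password]", "password")
                .data("remember_me", "on")
                .data("n", n)
                .cookies(execute.cookies())
                .post();
    }
}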
Some remarks about the way I found this out:
The first thing you need to realise is that, with Jsoup, you're trying to recreate the exact same requests your browser makes. So, in order to check whether what you have right now will work, you can try to recreate the exact same situation in your browser.
To reproduce what your code does, I went to the login page, deleted all my Goodreads cookies (since your code sends no cookies with the login request), and tried to sign in passing only the username and password form values. It gave an error that my session had timed out. When I instead loaded the login page, deleted all cookies except the session ID, and left the "n" form value in place, I could log in successfully. Therefore, you want to make a general GET request to the sign-in page first, retrieve the session ID cookie and the hidden form value from the response, and pass both along with the POST request.
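To make the cookie part of that experiment concrete, here is a minimal sketch using only the JDK (not Jsoup) of what "passing the session cookie along" means on the wire: the Set-Cookie values from the first response are reduced to a single Cookie header for the next request. In Jsoup, .cookies(execute.cookies()) takes care of this for you. The class name, the cookie names (_session_id, locale) and their values are made up for illustration; the real ones come from the site.

```java
import java.net.HttpCookie;
import java.util.List;
import java.util.stream.Collectors;

public class SessionCookieSketch {

    // Reduces the Set-Cookie values of the first response to the single
    // Cookie header value that has to accompany the login POST.
    public static String cookieHeader(List<String> setCookieValues) {
        return setCookieValues.stream()
                .flatMap(value -> HttpCookie.parse(value).stream())
                .map(cookie -> cookie.getName() + "=" + cookie.getValue())
                .collect(Collectors.joining("; "));
    }

    public static void main(String[] args) {
        // Hypothetical Set-Cookie values, as a server might send them on the GET.
        List<String> fromGet = List.of(
                "_session_id=abc123; Path=/; HttpOnly",
                "locale=en; Path=/");
        System.out.println("Cookie: " + cookieHeader(fromGet));
    }
}
```

Only the name=value pairs are sent back; attributes such as Path or HttpOnly stay on the client side.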
It could be that the API changed or that there are simply several ways to do it. In any case, using Connection.Method.POST will work fine.
Yes, they refer to the names of the input boxes. Form data is submitted under each input's name attribute, so that is the key you pass to .data(). Many websites also give each input a matching id, but the id is not what gets sent.
If you look at the source code of the sign-in form, you can see that the "action" attribute of the form element is indeed the sign-in page itself, so that's where it sends the request to.
PS. As a general tip, you can use the Firefox extension "Tamper Data" to remove form data or even cookies (though there are easier extensions for that).
1. Look carefully at what data is posted on login:
user[email]:email@email
remember_me:on
user[password]:plain_password
n:667387
So your POST must send exactly the same keys.
2. Make sure you use the right import: import org.jsoup.Connection.Method;
but Connection.Method.POST is still fine.
3. See point 1.
4. Yes, you are correct.
5. What is the question?
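To illustrate point 1, here is a small sketch using only the JDK (not Jsoup) of how those key/value pairs end up in the body of a form POST. Jsoup does this encoding itself when you call .data(...); the sketch only shows why the keys have to match the form's field names exactly, brackets included. The class and method names are made up for illustration.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

public class FormBodySketch {

    // Encodes name/value pairs as an application/x-www-form-urlencoded body,
    // keeping the insertion order of the map.
    public static String encode(Map<String, String> fields) {
        return fields.entrySet().stream()
                .map(e -> URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8)
                        + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8))
                .collect(Collectors.joining("&"));
    }

    public static void main(String[] args) {
        // The same keys as in the posted data listed above.
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("user[email]", "email@email");
        fields.put("remember_me", "on");
        fields.put("user[password]", "plain_password");
        fields.put("n", "667387");
        System.out.println(encode(fields));
    }
}
```

A key such as user[email] is sent as user%5Bemail%5D, so a slightly different key (for example email) would simply be a different field as far as the server is concerned.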