I am trying to grab my Amazon Associates stats automatically via cURL. However, I am falling at the first hurdle: logging in.
When I use the following code:
You need to get Amazon to set the cookie first.
Try:
// 1. Create a cookie file and set basic params
$ckfile = tempnam ("/your/path/to/cookie/folder", "cookie"); // second argument is a filename prefix, not a full file name
$target_host = "https://affiliate-program.amazon.com";
$target_request = "/gp/flex/sign-in/select.html";
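// http_build_query URL-encodes the values, so characters such as & or + in the credentials survive
$post_data = http_build_query (array ("action" => "sign-in", "email" => $username, "password" => $password), "", "&");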
// 2. Visit homepage to set cookie
$ch = curl_init ($target_host);
curl_setopt ($ch, CURLOPT_COOKIEJAR, $ckfile);
curl_setopt ($ch, CURLOPT_RETURNTRANSFER, true);
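$output = curl_exec ($ch);
// The cookie jar is only written when the handle is released, so release it before step 3 reads the file
curl_close ($ch); // releases the handle on PHP < 8
unset ($ch); // PHP 8+: curl_close has no effect, destroying the handle writes the jar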
// 3. Continue
$login = curl_init ($target_host.$target_request);
// CURLOPT_COOKIESESSION is deliberately not set: it makes cURL ignore the session cookies loaded from $ckfile
curl_setopt($login, CURLOPT_COOKIEJAR, $ckfile);
curl_setopt($login, CURLOPT_COOKIEFILE, $ckfile);
curl_setopt($login, CURLOPT_TIMEOUT, 40);
curl_setopt($login, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($login, CURLOPT_HEADER, 1);
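// $_SERVER['HTTP_USER_AGENT'] is not set when the script runs from the command line, so fall back to a fixed string
curl_setopt($login, CURLOPT_USERAGENT, isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : 'Mozilla/5.0');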
curl_setopt($login, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($login, CURLOPT_POST, 1);
curl_setopt($login, CURLOPT_POSTFIELDS, $post_data);
echo curl_exec($login);
curl_close($login);
EDIT: This code is broken as of June 2016. See this answer for explanation and potential workaround. The same technology mentioned in the previous link was added to associates' login.
I wrote this code and it works well for me; the final var_dump shows all my account info. If you don't delete the cookies, you can make subsequent cURL requests to protected pages while still logged in.
Hopefully this helps you learn how to do it. On big sites you often need to visit the login page first so that cookies get set, and the login form usually carries CSRF tokens in hidden fields that you have to submit along with your credentials.
Of course, if Amazon changes its forms or URLs, this will need to be adapted, but hopefully that doesn't happen too often.
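To illustrate the hidden-field point on its own, here is a minimal sketch that collects hidden inputs from a form with DOMDocument rather than a regular expression. The helper name extractHiddenFields and the sample markup are made up for illustration; Amazon's real form will look different, and this assumes the DOM extension is enabled.

```php
<?php
// Hypothetical helper (not part of the script below): return the
// name => value pairs of every hidden <input> in an HTML fragment.
function extractHiddenFields(string $html): array
{
    $doc = new DOMDocument();
    // Real-world markup is rarely valid, so collect parser warnings
    // internally instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $fields = [];
    foreach ($doc->getElementsByTagName('input') as $input) {
        $isHidden = strtolower($input->getAttribute('type')) === 'hidden';
        $name = $input->getAttribute('name');
        if ($isHidden && $name !== '') {
            // Attribute values come back with HTML entities already decoded.
            $fields[$name] = $input->getAttribute('value');
        }
    }
    return $fields;
}

// Made-up sample form, used only to show the behaviour.
$sampleForm = '<form name="sign_in" action="/signin">'
    . '<input name="token" value="abc 123" type="hidden">'
    . "<input type='hidden' name='returnTo' value='/home?x=1&amp;y=2'>"
    . '<input type="text" name="username">'
    . '</form>';

$postFields = extractHiddenFields($sampleForm);
$postFields['username'] = 'you@yoursite.com';
$postFields['password'] = 'password';

// http_build_query() URL-encodes keys and values and joins them with "&",
// which yields an application/x-www-form-urlencoded body.
$post = http_build_query($postFields, '', '&');
echo $post, PHP_EOL;
```

A DOM parser does not depend on attribute order or quote style, whereas the pattern in the full script only matches inputs written as type, name, value with double quotes. If the real form deviates from that, something along these lines may be easier to keep working.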
<?php
$email = 'you@yoursite.com';
$password = 'password';
// initial login page which redirects to correct sign in page, sets some cookies
$URL = 'https://affiliate-program.amazon.com/gp/associates/join/landing/main.html';
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $URL);
curl_setopt($ch, CURLOPT_COOKIEJAR, 'amazoncookie.txt');
curl_setopt($ch, CURLOPT_COOKIEFILE, 'amazoncookie.txt');
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows NT 6.1; WOW64; rv:35.0) Gecko/20100101 Firefox/35.0');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_HEADER, 1);
//curl_setopt($ch, CURLOPT_VERBOSE, true);
curl_setopt($ch, CURLOPT_STDERR, fopen('php://stdout', 'w'));
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
$page = curl_exec($ch);
//var_dump($page);exit;
// try to find the actual login form
if (!preg_match('/<form name="sign_in".*?<\/form>/is', $page, $form)) {
    die('Failed to find log in form!');
}
$form = $form[0];
// find the action of the login form
if (!preg_match('/action=(?:\'|")?([^\s\'">]+)/i', $form, $action)) {
    die('Failed to find login form url');
}
$URL2 = $action[1]; // this is our new post url
// find all hidden fields which we need to send with our login, this includes security tokens
$count = preg_match_all('/<input type="hidden"\s*name="([^"]*)"\s*value="([^"]*)"/i', $form, $hiddenFields);
$postFields = array();
// turn the hidden fields into an array
for ($i = 0; $i < $count; ++$i) {
    $postFields[$hiddenFields[1][$i]] = $hiddenFields[2][$i];
}
// add our login values
$postFields['username'] = $email;
$postFields['password'] = $password;
$post = '';
// convert to string, this won't work as an array, form will not accept multipart/form-data, only application/x-www-form-urlencoded
foreach ($postFields as $key => $value) {
    $post .= $key . '=' . urlencode($value) . '&';
}
$post = substr($post, 0, -1);
// set additional curl options using our previous options
curl_setopt($ch, CURLOPT_URL, $URL2);
curl_setopt($ch, CURLOPT_REFERER, $URL);
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, $post);
$page = curl_exec($ch); // make request
var_dump($page); // should be logged in
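As a rough sketch of the "subsequent requests" idea, a later script could reuse the saved cookie file like this. The helper name fetchWithCookies and the returned array shape are my own invention, and the URL in the usage line is only a placeholder for whatever protected page you actually want.

```php
<?php
// Hypothetical helper: request a page while sending (and updating) the
// cookies that an earlier login stored in $cookieFile.
function fetchWithCookies(string $url, string $cookieFile): array
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_COOKIEFILE     => $cookieFile, // read cookies from this file
        CURLOPT_COOKIEJAR      => $cookieFile, // write cookies back when the handle is released
        CURLOPT_RETURNTRANSFER => true,        // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_TIMEOUT        => 40,
    ]);

    $body = curl_exec($ch);
    $result = [
        'status'   => curl_getinfo($ch, CURLINFO_HTTP_CODE),
        'finalUrl' => curl_getinfo($ch, CURLINFO_EFFECTIVE_URL),
        'body'     => $body === false ? null : $body,
        'error'    => $body === false ? curl_error($ch) : null,
    ];

    // Destroying the handle is what flushes the cookie jar to disk.
    unset($ch);
    return $result;
}

// Usage (placeholder URL, substitute the protected page you need):
// $report = fetchWithCookies('https://affiliate-program.amazon.com/', 'amazoncookie.txt');
// if ($report['error'] !== null) { die($report['error']); }
```

Checking finalUrl after the call is one cheap way to notice an expired session, since a redirect back to a sign-in page usually means the cookies are no longer accepted.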