I need to make a proxy script that can access a page hidden behind a login screen. I do not need the proxy to "simulate" logging in; instead, the login page HTML should be
You could check out http://code.google.com/p/php-transparent-proxy/. I made it because I was asking myself that exact same question and decided to write one. It's under the BSD license, so have fun :)
I would recommend using cURL (a PHP extension that you might need to enable in your php.ini). It is used to talk to remote websites and handles cookies and every HTTP parameter you need. You'll have to write your proxy around the web pages you're hitting, but it'll do the job.
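As a rough illustration of the cURL approach, here is a minimal sketch, assuming the cURL extension is enabled. The function names `build_curl_options` and `fetch_with_cookies` and the cookie file path are hypothetical, chosen only for this example; the `CURLOPT_*` options are standard cURL options.

```php
<?php
// Hypothetical helper: builds the cURL options for one proxied request.
// $cookieFile is where cURL stores and reloads the remote site's cookies.
function build_curl_options(string $url, string $cookieFile, ?array $postFields = null): array
{
    $options = [
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true,        // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,        // follow redirects, e.g. after a login
        CURLOPT_COOKIEJAR      => $cookieFile, // write received cookies to this file
        CURLOPT_COOKIEFILE     => $cookieFile, // send previously stored cookies back
    ];
    if ($postFields !== null) {
        // Forward a submitted form (such as the login form) as a POST request.
        $options[CURLOPT_POST]       = true;
        $options[CURLOPT_POSTFIELDS] = http_build_query($postFields);
    }
    return $options;
}

// Performs the request; returns the response body, or null on failure.
function fetch_with_cookies(string $url, string $cookieFile, ?array $postFields = null): ?string
{
    $ch = curl_init();
    curl_setopt_array($ch, build_curl_options($url, $cookieFile, $postFields));
    $body = curl_exec($ch);
    return $body === false ? null : $body;
}
```

Because the same cookie file is used for reading and writing, a session cookie set by the remote login page should be sent along on later requests. In a real proxy you would probably want one cookie file per visitor.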
Have your PHP script request the URL you want, and rewrite all links and form actions to point back to your PHP script. When the script receives a request that has a url parameter, forward that request to the remote server and repeat.
You won't be able to catch all JavaScript requests (unless you implement a JavaScript portion of your "proxy").
E.g., the user types http://example.com/login.php into your proxy form.
Send the user to http://yoursite.com/proxy.php?url=http://example.com/login.php
Make sure to urlencode the parameter "http://example.com/login.php".
In http://yoursite.com/proxy.php, you make an HTTP request to http://example.com/login.php:
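    $url = isset($_REQUEST['url']) ? $_REQUEST['url'] : '';

    // make sure we have an http(s) URL and not a file path
    // (the pattern is anchored so the scheme must come first)
    if (!preg_match('`^https?://`i', $url)) {
        die('Not a URL');
    }

    // make the HTTP request to the requested URL
    $content = file_get_contents($url);
    if ($content === false) {
        die('Request failed');
    }

    // parse all links and form actions and redirect them back to this script
    // (placeholder pattern and replacement: write your own)
    $content = preg_replace("/some-smart-regex-here/i", "$1 or $2 smart replaces", $content);

    echo $content;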
Note that /some-smart-regex-here/i is a placeholder for a regular expression you have to write yourself to match links, form actions, and so on.
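To give an idea of what such an expression could look like, here is a minimal sketch. The function name `rewrite_links` is hypothetical, and the pattern is only an assumption about what is "smart" enough: it handles quoted href and action attributes that hold absolute http(s) URLs, and leaves relative URLs, unquoted attributes, and URLs inside JavaScript or CSS untouched.

```php
<?php
// Hypothetical helper: rewrites absolute http(s) URLs in href and action
// attributes so that they point back at the proxy script.
function rewrite_links(string $html, string $proxyUrl): string
{
    // Group 1: attribute name, group 2: quote character, group 3: the URL.
    $pattern = '/\b(href|action)\s*=\s*(["\'])(https?:\/\/[^"\']*)\2/i';

    $result = preg_replace_callback($pattern, function (array $m) use ($proxyUrl): string {
        // Attribute values are HTML-escaped, so decode them before URL-encoding.
        $target = html_entity_decode($m[3], ENT_QUOTES);
        return $m[1] . '=' . $m[2] . $proxyUrl . '?url=' . urlencode($target) . $m[2];
    }, $html);

    // preg_replace_callback() returns null on a regex error; fall back to the input.
    return $result ?? $html;
}

$html = '<a href="http://example.com/login.php">Log in</a>';
echo rewrite_links($html, 'http://yoursite.com/proxy.php');
// prints: <a href="http://yoursite.com/proxy.php?url=http%3A%2F%2Fexample.com%2Flogin.php">Log in</a>
```

For anything beyond simple pages, an HTML parser would likely be more robust than a regular expression, but the idea stays the same.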
The example only proxies the HTTP body; you may want to proxy the HTTP headers as well. For that you can use fsockopen() or the PHP stream functions in PHP 5+ (stream_socket_client() etc.).
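Here is one possible sketch of relaying headers. Note that it uses cURL's CURLOPT_HEADERFUNCTION option rather than fsockopen() or the stream functions, simply because it keeps the example short. The function names `headers_to_forward` and `proxy_with_headers`, and the allow-list of header names, are assumptions made for this illustration.

```php
<?php
// Hypothetical helper: picks which raw response header lines to pass on.
// The default allow-list is an assumption; adjust it to the site being proxied.
function headers_to_forward(
    array $rawHeaders,
    array $allowed = ['content-type', 'content-language', 'cache-control']
): array {
    $forward = [];
    foreach ($rawHeaders as $line) {
        // Status lines and blank lines have no "Name: value" form and are skipped.
        $parts = explode(':', $line, 2);
        if (count($parts) === 2 && in_array(strtolower(trim($parts[0])), $allowed, true)) {
            $forward[] = trim($line);
        }
    }
    return $forward;
}

// Fetches $url, relays the selected headers to the client, and returns the
// body (or null on failure). Must run before any output has been sent.
function proxy_with_headers(string $url): ?string
{
    $rawHeaders = [];
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    // cURL calls this once per header line; it must return the line's length.
    curl_setopt($ch, CURLOPT_HEADERFUNCTION, function ($ch, string $line) use (&$rawHeaders): int {
        $rawHeaders[] = $line;
        return strlen($line);
    });
    $body = curl_exec($ch);
    if ($body === false) {
        return null;
    }
    foreach (headers_to_forward($rawHeaders) as $header) {
        header($header);
    }
    return $body;
}
```

Headers such as Set-Cookie and Location are deliberately left out of the allow-list here, since passing them through unchanged would point the browser at the remote site instead of the proxy.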
What you are talking about is accessing pages for which you need to authenticate yourself.
Here are a few things that must be laid down:
The key point is that you cannot gain access without authenticating yourself first.
As for language, it is pretty doable in PHP. And as the tags on the question suggest, you are using the right tools to do that job already.
One thing I would like to know: why are you calling it a "proxy"? Do you want to serve the content to other users?
EDIT: [update after comment]
In that case, use phproxy. It does what you want, along with a host of other features.