Question
I'll start with what my program does. The index function of the controller takes an array of URLs and keywords and stores them in the DB. The crawlLink method then takes all the keywords and URLs. Each URL is searched for all the keywords, and the sublinks of each URL are generated and stored in the DB, where they are searched for the keywords as well. Keywords are searched for in each link using the search method. The sublinks are extracted from the URLs using the extract_links function. Both search and extract_links call a function named get_web_page, which fetches the complete content of a page using cURL. search calls it once to get the content of a web page so that keywords can be extracted from it. extract_links also calls it, to extract links with valid page content.
Now crawlLink calls the search function twice: once to extract keywords from the domain links and a second time to extract keywords from the sublinks. Hence, get_web_page runs three times in total (twice through search and once through extract_links). It takes approximately 5 minutes to get the contents of around 150 links, and since that happens three times, the processing takes about 15 minutes. During that time nothing else can be done. I therefore want to run this process in the background and show its status while it is processing. extract_links and get_web_page are included in the controller using include_once.
The get_web_page function is as follows:
function get_web_page( $url )
{
    $options = array(
        CURLOPT_RETURNTRANSFER => true,     // return web page
        CURLOPT_HEADER         => false,    // don't return headers
        CURLOPT_FOLLOWLOCATION => true,     // follow redirects
        CURLOPT_ENCODING       => "",       // handle compressed
        CURLOPT_USERAGENT      => "spider", // who am i
        CURLOPT_AUTOREFERER    => true,     // set referer on redirect
        CURLOPT_CONNECTTIMEOUT => 120,      // timeout on connect
        CURLOPT_TIMEOUT        => 120,      // timeout on response
        CURLOPT_MAXREDIRS      => 10,       // stop after 10 redirects
    );
    $ch = curl_init( $url );
    curl_setopt_array( $ch, $options );
    $content = curl_exec( $ch );
    $err     = curl_errno( $ch );
    $errmsg  = curl_error( $ch );
    $header  = curl_getinfo( $ch );
    curl_close( $ch );
    $header['errno']   = $err;
    $header['errmsg']  = $errmsg;
    $header['content'] = $content;
    return $header;
}
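Since search and extract_links both fetch pages through get_web_page, one way to avoid downloading the same URL more than once is to remember each result the first time it is fetched. The following is only a sketch under that assumption: get_web_page_cached is a hypothetical wrapper that does not exist in the original code, and the fetcher is passed in as a callable so the sketch can be tried without network access.

```php
<?php
/**
 * Hypothetical wrapper (not part of the original code): remembers the
 * result of the first fetch for each URL so that later calls within the
 * same PHP request reuse it instead of downloading the page again.
 *
 * $fetcher is any callable that takes a URL and returns the page array,
 * for example the string 'get_web_page'.
 */
function get_web_page_cached($url, $fetcher)
{
    static $cache = array();

    if (!array_key_exists($url, $cache)) {
        // First request for this URL: fetch it and keep the result.
        $cache[$url] = call_user_func($fetcher, $url);
    }
    return $cache[$url];
}
```

It could be called as get_web_page_cached($url, 'get_web_page') wherever get_web_page($url) is called now. The cache lives only for the duration of one request and keeps every fetched page in memory, so it trades memory for fewer downloads.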
A single input of URLs and keywords from the user can be considered a task. Once a task is started, it should run in the background. At the same time, another task can be defined and started. Each task will have a status such as "To Do", "In Progress", "Pending", "Done", etc. The Simple Task Board by Oscar Dias is exactly how I want the tasks to be displayed.
I have read about so many ways to run a function in the background that I am now in a dilemma about which approach to adopt. I read about exec, pcntl_fork, Gearman and others, but all of them need the CLI, which I don't want to use. I tried installing Gearman with Cygwin but got stuck because the installation cannot find libevent. I installed libevent separately, but it still doesn't work. Gearman needs the CLI anyway, so I dropped it. I don't want to use cron either. I just want to know which approach will be best in my scenario.
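To make the idea of a task with a status concrete, here is a rough sketch. The status names come from the list above; the function names, the array layout, and the set of allowed transitions are all assumptions made for illustration, and in the real application a task would presumably be a row in a MySQL table and not a plain array.

```php
<?php
// Which status may follow which is an assumption for this sketch.
function task_allowed_transitions()
{
    return array(
        'To Do'       => array('In Progress'),
        'In Progress' => array('Pending', 'Done'),
        'Pending'     => array('In Progress'),
        'Done'        => array(),
    );
}

// Creates a new task (hypothetical layout) in the initial "To Do" status.
function task_create($id, array $urls, array $keywords)
{
    return array(
        'id'       => $id,
        'urls'     => $urls,
        'keywords' => $keywords,
        'status'   => 'To Do',
    );
}

// Returns the task with its status changed, or throws if the move is not allowed.
function task_set_status(array $task, $newStatus)
{
    $allowed = task_allowed_transitions();
    $current = $task['status'];

    if (!isset($allowed[$current]) || !in_array($newStatus, $allowed[$current], true)) {
        throw new InvalidArgumentException(
            "Cannot move task from '$current' to '$newStatus'"
        );
    }
    $task['status'] = $newStatus;
    return $task;
}
```

A task board page would then only need to group the stored tasks by their status field.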
I am using PHP 5.3.8 | Codeigniter 2.1.3 | Apache 2.2.21 | MySQL 5.5.16 | Windows 7 64 bit
Answer 1:
Your problem is Windows.
Windows is simply not very good at running background tasks and cron jobs. There are tools you can find, but they are limited.
However, are you sure you even need this? Most servers run Linux, so why don't you just test on Windows and then move over?
--
The second part is the command line: you need it if you want to start a new process (which you do). But it isn't really very scary. With CodeIgniter it is quite simple:
http://ellislab.com/codeigniter/user-guide/general/cli.html
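As an illustration of starting a new process from PHP, the sketch below builds a command line that invokes a controller method through CodeIgniter's CLI entry point (php index.php controller method args) and launches it without waiting for it to finish. The helper names build_background_command and run_in_background are hypothetical, and the sketch assumes that php is on the PATH and that the working directory is the one containing index.php.

```php
<?php
/**
 * Builds a shell command that runs a controller method via the CLI
 * entry point and detaches it from the calling request.
 */
function build_background_command($controller, $method, array $args, $isWindows)
{
    $parts = array('php', 'index.php', $controller, $method);
    foreach ($args as $arg) {
        // Quote each argument so it reaches the script as a single value.
        $parts[] = escapeshellarg((string) $arg);
    }
    $command = implode(' ', $parts);

    if ($isWindows) {
        // "start /B" starts the program without opening a new window.
        return 'start /B ' . $command . ' > NUL 2>&1';
    }
    // On Unix-like systems, nohup plus a trailing & puts it in the background.
    return 'nohup ' . $command . ' > /dev/null 2>&1 &';
}

/**
 * Starts the command through popen and closes the handle right away,
 * a commonly used idiom for not waiting on the child's output.
 */
function run_in_background($command)
{
    $handle = popen($command, 'r');
    if ($handle === false) {
        return false;
    }
    pclose($handle);
    return true;
}
```

For example, with a hypothetical controller named crawler and a task id of 7, the call would be run_in_background(build_background_command('crawler', 'crawlLink', array(7), strtoupper(substr(PHP_OS, 0, 3)) === 'WIN')).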
Answer 2:
You can run it as a nohup process or as a cron job. Please go through the links below:
nohup: run PHP process in background
Running a php5 background process under Linux
https://nsaunders.wordpress.com/2007/01/12/running-a-background-process-in-php/
Answer 3:
The approach I was trying to achieve above did not seem possible to implement on Windows. Many of the methods listed in the question are either removed or modified. I then moved on to a workaround involving AJAX.
I execute the controller method as an AJAX request and give it a count, which increments with each new AJAX request. Each request can be aborted, although the processing will continue; in my project the results ultimately matter even if they are incomplete. If the browser stays open, the request may complete, and the user can later see the complete result.
When the processing of a task is stopped, a CANCELLED icon is shown together with a link to the result page, which displays the results generated before the task was cancelled. On AJAX failure or success, the server sends back to the client the task count that the client originally sent. Thus the results are displayed for a unique task and don't get mixed up.
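The server side of this exchange might look roughly like the sketch below. The function name build_task_response and the field names in the JSON body are assumptions made for illustration; the only point is that the count received from the client is echoed back unchanged so the client can match the response to the right task.

```php
<?php
/**
 * Builds the JSON body returned to the browser. $taskCount is the
 * counter the client sent with its request; it is sent back as-is.
 */
function build_task_response($taskCount, array $results, $completed)
{
    return json_encode(array(
        'task'      => (int) $taskCount,   // echoed back to identify the task
        'completed' => (bool) $completed,  // false if the results are partial
        'results'   => $results,
    ));
}
```

Whether a script keeps running after the browser aborts the request is controlled in PHP by ignore_user_abort(), so calling ignore_user_abort(true) at the start of the controller method is one way to make that behaviour explicit.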
But there is no tracking of how far a given task has progressed, and the time taken for execution cannot be determined. So this approach works for me but has some drawbacks. The main aim was that the user should not have to wait while a task is in progress, and the workaround above more or less achieves that.
Source: https://stackoverflow.com/questions/14681304/run-controller-methods-in-background-codeigniter-windows
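One possible way to reduce the progress-tracking drawback, which I have not implemented, would be for the long-running request to record how many links it has processed and for a second, short AJAX request to read that record. The sketch below uses a JSON file in the system temp directory; the helper names, the file location, and the field names are all assumptions, and a DB table would serve equally well.

```php
<?php
// Path of the (hypothetical) progress file for one task.
function progress_file($taskCount)
{
    return sys_get_temp_dir() . DIRECTORY_SEPARATOR
        . 'crawl_task_' . (int) $taskCount . '.json';
}

// Called by the crawl loop after each processed link.
function progress_write($taskCount, $done, $total)
{
    $data = array(
        'done'    => (int) $done,
        'total'   => (int) $total,
        'updated' => time(),
    );
    // LOCK_EX keeps two writers from interleaving their output.
    $bytes = file_put_contents(progress_file($taskCount), json_encode($data), LOCK_EX);
    return $bytes !== false;
}

// Called by the polling request; returns null if nothing usable is stored.
function progress_read($taskCount)
{
    $file = progress_file($taskCount);
    if (!is_file($file)) {
        return null; // task not started yet, or unknown task
    }
    $data = json_decode(file_get_contents($file), true);
    return is_array($data) ? $data : null;
}
```

The polling request would return the done and total values for a given task count, and the difference between successive updated timestamps gives a rough idea of the time taken.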