task-queue

NDB not clearing memory during a long request

Submitted by 牧云@^-^@ on 2019-11-30 14:21:48
I am currently offloading a long running job to a TaskQueue to calculate connections between NDB entities in the Datastore. Basically this queue handles several lists of entity keys that are to be related to another query by the node_in_connected_nodes function in the GetConnectedNodes node:

    class GetConnectedNodes(object):
        """Class for getting the connected nodes from a list of nodes in a paged way"""
        def __init__(self, list, query):
            # super(GetConnectedNodes, self).__init__()
            self.nodes = [ndb.model.Key('Node', '%s' % x) for x in list]
            self.cursor = 0
            self.MAX_QUERY = 100
            # logging.info('Max
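For context, a pattern often suggested for long Datastore iterations like this is to disable or periodically clear NDB's in-context cache, since by default every entity fetched during the request stays referenced by the context and cannot be garbage-collected. A minimal sketch of that idea, assuming the legacy google.appengine.ext.ndb API (the function name and batch size are illustrative, not from the question):

    from google.appengine.ext import ndb

    def process_keys_in_batches(keys, batch_size=100):
        """Fetch entities page by page while keeping NDB's in-context cache from growing."""
        ctx = ndb.get_context()
        ctx.set_cache_policy(False)      # don't keep fetched entities in the in-context cache
        ctx.set_memcache_policy(False)   # optionally skip memcache as well

        for start in range(0, len(keys), batch_size):
            batch = keys[start:start + batch_size]
            entities = ndb.get_multi(batch)
            # ... do the connection calculations on `entities` here ...
            ctx.clear_cache()            # drop anything cached before the next page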

How to do OAuth-requiring operations in a GAE Task Queue?

Submitted by 邮差的信 on 2019-11-30 10:32:21
I have a simple Google App Engine app that includes a /update page which updates a YouTube playlist. It looks like this:

    class UpdatePage(webapp2.RequestHandler):
        @decorator.oauth_required
        def get(self):
            update_result = self.update_playlist()
            ...

    routes = [('/update', UpdatePage),
              (decorator.callback_path, decorator.callback_handler())]
    app = webapp2.WSGIApplication(routes, debug=True)

It works as expected and the update_playlist() method does its job, but it turns out that under some circumstances it can run for a pretty long time, resulting in a DeadlineExceededError. So after reading about
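One commonly suggested approach is to do only the OAuth handshake in the interactive handler, enqueue a task, and have the task load the stored credentials itself instead of relying on a signed-in user. A sketch under that assumption, using oauth2client's StorageByKeyName; the /update_task URL, UpdateTask handler, and run_update() helper are hypothetical:

    import httplib2
    import webapp2
    from google.appengine.api import taskqueue, users
    from oauth2client.contrib.appengine import CredentialsModel, StorageByKeyName

    class UpdatePage(webapp2.RequestHandler):
        @decorator.oauth_required          # `decorator` is the OAuth2Decorator from the question
        def get(self):
            # Enqueue the slow work and return quickly; pass the user id so the
            # task can load the stored credentials without an interactive user.
            user_id = users.get_current_user().user_id()
            taskqueue.add(url='/update_task', params={'user_id': user_id})

    class UpdateTask(webapp2.RequestHandler):
        def post(self):
            user_id = self.request.get('user_id')
            credentials = StorageByKeyName(CredentialsModel, user_id, 'credentials').get()
            if credentials and not credentials.invalid:
                http = credentials.authorize(httplib2.Http())
                run_update(http)           # run_update() is a hypothetical playlist helper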

Specifying retry limit for tasks queued using GAE deferred library

Submitted by 时光怂恿深爱的人放手 on 2019-11-29 17:50:06
We are offloading certain time-consuming tasks using the GAE deferred library and would like to know how we can set the retry limit for those offloaded tasks. We are running into issues where certain tasks are retried forever, because the task can never succeed due to some unrecoverable exception. According to the documentation, the _retry_options argument of the deferred.defer API can be used to pass retry options to the associated Task() instance:

    _countdown, _eta, _headers, _name, _target, _transactional, _url, _retry_options, _queue:
        Passed through to the task queue - see the task queue documentation
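In practice that usually means constructing a taskqueue.TaskRetryOptions object and passing it through _retry_options. A minimal sketch, where do_work() and the entity id are placeholders:

    from google.appengine.api import taskqueue
    from google.appengine.ext import deferred

    def do_work(entity_id):
        # hypothetical task body that may raise an unrecoverable exception
        pass

    # Give up after two attempts instead of retrying forever.
    retry_options = taskqueue.TaskRetryOptions(task_retry_limit=2)
    deferred.defer(do_work, 'some-entity-id', _retry_options=retry_options)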

asynchronous processing with PHP - one worker per job

Submitted by 浪子不回头ぞ on 2019-11-28 18:59:08
Consider a PHP web application whose purpose is to accept user requests to start generic asynchronous jobs, and then create a worker process/thread to run the job. The jobs are not particularly CPU or memory intensive, but are expected to block on I/O calls fairly often. No more than one or two jobs should be started per second, but due to the long run times there may be many jobs running at once. Therefore, it's of utmost importance that the jobs run in parallel. Also, each job must be monitored by a manager daemon responsible for killing hung workers, aborting workers on user request, etc.
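The worker-per-job plus supervising-manager shape is language-agnostic; purely as an illustration of that shape (sketched in Python, the language used by most entries on this page, not as a PHP answer), with a hypothetical run_job() body and timeout:

    import multiprocessing
    import time

    def run_job(job_id):
        # placeholder for I/O-bound work; the real job body goes here
        time.sleep(5)

    def manager(job_requests, timeout=60):
        """Start one worker per job and terminate workers that exceed the timeout."""
        workers = {}
        for job_id in job_requests:
            p = multiprocessing.Process(target=run_job, args=(job_id,))
            p.start()
            workers[job_id] = (p, time.time())

        # Simple supervision loop: reap finished workers, kill hung ones.
        while workers:
            for job_id, (p, started) in list(workers.items()):
                if not p.is_alive():
                    p.join()
                    del workers[job_id]
                elif time.time() - started > timeout:
                    p.terminate()        # abort a hung worker
                    p.join()
                    del workers[job_id]
            time.sleep(1)

    if __name__ == '__main__':
        manager(['job-1', 'job-2'])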

how to use q.js promises to work with multiple asynchronous operations

Submitted by 旧巷老猫 on 2019-11-28 16:24:52
Note: This question is also cross-posted to the Q.js mailing list here. I had a situation with multiple asynchronous operations, and the answer I accepted pointed out that using promises with a library such as q.js would be more beneficial. I am convinced to refactor my code to use promises, but because the code is pretty long, I have trimmed the irrelevant portions and exported the crucial parts into a separate repo. The repo is here and the most important file is this. The requirement is that I want pageSizes to be non-empty after traversing all the dragged-and-dropped files. The problem is

Google appengine: Task queue performance

Submitted by 感情迁移 on 2019-11-28 12:51:29
I currently have an application running on App Engine and I am executing a few jobs using the deferred library; some of these tasks run daily, while some of them are executed once a month. Most of these tasks query Datastore to retrieve documents and then store the entities in an index (Search API). Some of these tables are replaced monthly and I have to run these tasks on all entities (4~5M). One example of such a task is:

    def addCompaniesToIndex(cursor=None, n_entities=0, mindate=None):
        #get index
        BATCH_SIZE = 200
        cps, next_cursor, more = Company.query().\
            fetch_page(BATCH_SIZE, start_cursor
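The usual shape for jobs like this is to process one page per task invocation and re-defer with the cursor, so no single task ever holds millions of entities. A sketch along those lines, assuming the Company model from the question and a hypothetical build_document() helper and index name:

    from google.appengine.api import search
    from google.appengine.ext import deferred

    BATCH_SIZE = 200

    def addCompaniesToIndex(cursor=None, n_entities=0, mindate=None):
        index = search.Index(name='companies')     # illustrative index name
        cps, next_cursor, more = Company.query().fetch_page(BATCH_SIZE, start_cursor=cursor)

        # Index this page only, keeping per-task work and memory small.
        index.put([build_document(cp) for cp in cps])   # build_document() is hypothetical

        if more:
            # Chain the next page as a fresh task instead of looping in this one.
            deferred.defer(addCompaniesToIndex,
                           cursor=next_cursor,
                           n_entities=n_entities + len(cps),
                           mindate=mindate)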

Google App Engine: task_retry_limit doesn't work?

Submitted by 扶醉桌前 on 2019-11-28 11:08:36
I have a Python GAE app. I want my tasks to stop running, or retry just once, if they fail. Right now they run forever despite what my yaml file is telling them! Here is a queue.yaml entry:

    - name: globalPurchase
      rate: 10/s
      bucket_size: 100
      retry_parameters:
        task_retry_limit: 1

If a globalPurchase task fails with a 500 error code, it is retried forever until it succeeds, with this message in the logs: "Task named "task14" on queue "globalPurchase" failed with code 500; will retry in 30 seconds". Why is task_retry_limit not actually being used?

Travis: I had the same problem. The documentation and
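If the queue-level setting does not appear to take effect, one thing worth trying is attaching the retry limit to the task itself at enqueue time. A sketch of that, with a hypothetical handler URL and parameter:

    from google.appengine.api import taskqueue

    def enqueue_global_purchase(purchase_id):
        # Per-task retry limit as a fallback if the queue-level setting is not picked up.
        retry_options = taskqueue.TaskRetryOptions(task_retry_limit=1)
        taskqueue.Task(url='/tasks/global_purchase',       # hypothetical handler URL
                       params={'purchase_id': purchase_id},
                       retry_options=retry_options).add(queue_name='globalPurchase')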

Parallel processing in PHP - How do you do it?

Submitted by 僤鯓⒐⒋嵵緔 on 2019-11-27 11:21:30
I am currently trying to implement a job queue in PHP. The queue will then be processed as a batch job and should be able to process some jobs in parallel. I already did some research and found several ways to implement it, but I am not really aware of their advantages and disadvantages. E.g., doing the parallel processing by calling a script several times through fsockopen, as explained here: Easy parallel processing in PHP. Another way I found was using the curl_multi functions: curl_multi_exec PHP docs. But I think those two ways will add pretty much overhead for creating batch processing on a
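For what the batch-plus-fixed-pool variant looks like in outline (again sketched in Python rather than PHP, purely to show the shape), with a hypothetical run_job() body:

    from concurrent.futures import ThreadPoolExecutor, as_completed

    def run_job(job):
        # placeholder for the actual job body (HTTP calls, file processing, ...)
        return '%s done' % job

    def process_batch(jobs, workers=4):
        """Drain a batch of queued jobs with a fixed number of parallel workers."""
        results = []
        with ThreadPoolExecutor(max_workers=workers) as pool:
            futures = {pool.submit(run_job, job): job for job in jobs}
            for future in as_completed(futures):
                results.append(future.result())
        return results

    print(process_batch(['job-1', 'job-2', 'job-3']))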

Is there a performance difference between pooling connections or channels in rabbitmq?

Submitted by 青春壹個敷衍的年華 on 2019-11-27 06:07:33
I'm a newbie with RabbitMQ (and programming), so sorry in advance if this is obvious. I am creating a pool to share between threads that are working on a queue, but I'm not sure if I should use connections or channels in the pool. I know I need channels to do the actual work, but is there a performance benefit to having one channel per connection (in terms of more throughput from the queue)? Or am I better off just using a single connection per application and pooling many channels? Note: because I'm pooling the resources, the initial cost is not a factor, as I know connections are more expensive than
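For context, a channel is a lightweight virtual connection multiplexed over a single TCP connection, which is why the common advice is one (or a few) connections per application and a channel per worker; note that with some clients (pika's BlockingConnection, for example) a connection or channel should not be shared across threads without synchronization. A minimal pika sketch of several channels over one connection, assuming a broker on localhost and an illustrative 'work' queue:

    import pika

    # One TCP connection for the application; channels are cheap and multiplexed over it.
    connection = pika.BlockingConnection(pika.ConnectionParameters(host='localhost'))

    # Open a few channels on the same connection, e.g. one per logical worker.
    channels = [connection.channel() for _ in range(4)]

    for i, ch in enumerate(channels):
        ch.queue_declare(queue='work')
        ch.basic_publish(exchange='', routing_key='work', body='job %d' % i)

    connection.close()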