How can I fetch more than 1000 records from the datastore and put them all in one single list to pass to Django?
Starting with version 1.3.6 (released Aug-17-2010), you CAN.
From the changelog:
Results of datastore count() queries and offsets for all datastore queries are no longer capped at 1000.
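As a minimal sketch of what that change buys you, assuming SDK 1.3.6 or later and a hypothetical Person model (you still pass an explicit limit if you want more than the defaults):

from google.appengine.ext import db

class Person(db.Model):          # hypothetical model, for illustration only
    name = db.StringProperty()

# With SDK 1.3.6+ the 1000-result cap is gone, so a limit above 1000 is honoured.
everyone = Person.all().fetch(100000)   # may be slow/memory-heavy for large kinds
total = Person.all().count(100000)      # counts past the old 1000 cap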
App Engine gives you a nice way of "paging" through the results 1000 at a time by ordering on Keys and using the last key as the next offset. They even provide some sample code here:
http://code.google.com/appengine/docs/python/datastore/queriesandindexes.html#Queries_on_Keys
Although their example spreads the queries out over many requests, you can change the page size from 20 to 1000 and query in a loop, combining the result sets. Additionally, you might use itertools to link the queries together without evaluating them before they're needed.
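Here's one hedged sketch of that paging loop wrapped in a generator (fetch_all and page_size are made-up names, and MyModel stands for whatever db.Model you're reading):

import itertools
from google.appengine.ext import db

def fetch_all(model_class, page_size=1000):
    """Yield every entity of model_class, paging on __key__ in 1000-row chunks."""
    query = model_class.all().order('__key__')
    page = query.fetch(page_size)
    while page:
        for entity in page:
            yield entity
        if len(page) < page_size:
            return                       # short page: nothing left to fetch
        last_key = page[-1].key()
        page = (model_class.all()
                           .order('__key__')
                           .filter('__key__ >', last_key)
                           .fetch(page_size))

all_rows = list(fetch_all(MyModel))                            # one big list
first_2500 = list(itertools.islice(fetch_all(MyModel), 2500))  # stays lazy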
For example, to count all of your rows (past the old 1000 cap):
class MyModel(db.Expando):
    @classmethod
    def count_all(cls):
        """
        Count *all* of the rows (without maxing out at 1000)
        """
        count = 0
        query = cls.all().order('__key__')
        while count % 1000 == 0:
            current_count = query.count()
            if current_count == 0:
                break
            count += current_count
            if current_count == 1000:
                last_key = query.fetch(1, 999)[0].key()
                query = query.filter('__key__ > ', last_key)
        return count
JJG: your solution above is awesome, except that it causes an infinite loop if you have 0 records. (I found this out while testing some of my reports locally).
I modified the start of the while loop to look like this:
while count % 1000 == 0:
    current_count = query.count()
    if current_count == 0:
        break
If you're using NDB:
@staticmethod
def _iterate_table(table, chunk_size=200):
    offset = 0
    while True:
        # Fetch one extra row so we can tell whether another page follows.
        results = table.query().order(table.key).fetch(chunk_size + 1, offset=offset)
        if not results:
            break
        for result in results[:chunk_size]:
            yield result
        if len(results) < chunk_size + 1:
            break
        offset += chunk_size
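To collect everything into one list, a usage sketch (the helper class name DbTools and the model MyModel are placeholders, not from the answer above):

# MyModel is a hypothetical ndb.Model; DbTools is whatever class holds
# the _iterate_table staticmethod above.
all_rows = list(DbTools._iterate_table(MyModel))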
class Count(object):
    def getCount(self, cls):
        """
        Count *all* of the rows (without maxing out at 1000)
        """
        count = 0
        query = cls.all().order('__key__')
        while 1:
            current_count = query.count()
            count += current_count
            if current_count == 0:
                break
            last_key = query.fetch(1, current_count - 1)[0].key()
            query = query.filter('__key__ > ', last_key)
        return count
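A usage sketch, with MyModel standing in for any db.Model subclass:

# Hypothetical model name; getCount walks the whole kind in 1000-row hops.
total = Count().getCount(MyModel)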
We are using something like this in our ModelBase class:
@classmethod
def get_all(cls):
    q = cls.all()
    holder = q.fetch(1000)
    result = holder
    while len(holder) == 1000:
        holder = q.with_cursor(q.cursor()).fetch(1000)
        result += holder
    return result
This gets around the 1000-result fetch limit on every model without having to think about it. I suppose a keys-only version would be just as easy to implement.
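For what it's worth, here is a hedged sketch of that keys-only variant, following the same cursor pattern (the method name get_all_keys is made up):

@classmethod
def get_all_keys(cls):
    # keys_only queries return bare Key objects, which is cheaper when you
    # only need identifiers rather than full entities.
    q = cls.all(keys_only=True)
    holder = q.fetch(1000)
    result = holder
    while len(holder) == 1000:
        holder = q.with_cursor(q.cursor()).fetch(1000)
        result += holder
    return result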