How can I fetch more than 1,000 records from the datastore and put them all in one single list to pass to Django?
The 1K-limit issue has since been resolved; you can simply iterate over the query:
query = MyModel.all()
for doc in query:
    print doc.title
This works by treating the Query object as an iterable: the iterator retrieves results from the datastore in small batches, allowing the app to stop iterating early so it never fetches more than it needs. Iteration stops when all of the results that match the query have been retrieved. As with fetch(), the iterator interface does not cache results, so creating a new iterator from the Query object re-executes the query.
The maximum batch size is 1K, and the automatic Datastore quotas still apply as well.
But with the 1.3.1 SDK, they introduced cursors that can be serialized and saved, so that a future invocation can begin the query where it last left off.
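Here is a minimal sketch of cursor-based paging with the old db API, assuming the MyModel kind from above (the title property type is assumed):

from google.appengine.ext import db

class MyModel(db.Model):
    title = db.StringProperty()

def iterate_with_cursor(page_size=1000):
    query = MyModel.all()
    results = query.fetch(page_size)
    while results:
        for doc in results:
            yield doc
        # cursor() returns a websafe string after a fetch; it can be stored
        # (e.g. in memcache or a task payload) and reused in a later request.
        cursor = query.cursor()
        query = MyModel.all().with_cursor(cursor)
        results = query.fetch(page_size)

Because the cursor is just a string, a future invocation can pick up exactly where the previous one stopped instead of re-running the query from the beginning.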
To add the contents of the two queries together:
list1 = first query
list2 = second query
list1 += list2
list1 now contains all 2,000 results.
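A minimal sketch of what those two queries could look like, assuming the second one simply continues at an offset of 1,000 (under the old limits the offset itself was capped at 1,000):

list1 = MyModel.all().fetch(1000)               # results 1..1000
list2 = MyModel.all().fetch(1000, offset=1000)  # results 1001..2000
list1 += list2                                  # list1 now holds up to 2,000 entities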
Fetching through the remote API still has issues with more than 1,000 records. We wrote this tiny function to iterate over a table in chunks:
def _iterate_table(table, chunk_size=200):
    offset = 0
    while True:
        # Fetch one extra row so we can tell whether another chunk follows.
        results = table.all().order('__key__').fetch(chunk_size + 1, offset=offset)
        if not results:
            break
        for result in results[:chunk_size]:
            yield result
        if len(results) < chunk_size + 1:
            break
        offset += chunk_size
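Hypothetical usage over remote_api, streaming every MyModel entity in chunks of 200:

for entity in _iterate_table(MyModel):
    print entity.key()
# or collect everything into a single list:
all_rows = list(_iterate_table(MyModel))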
The proposed solution only works if entries are sorted by key. If you sort by another column first, you still have to use a limit/offset clause, so the 1,000-entry limitation still applies. The same holds if you use two requests: one to retrieve keys (with conditions and sort order) and another using a WHERE key IN (...) clause with a subset of keys from the first result, since the first request cannot return more than 1,000 keys. (The Google "Queries on Keys" section does not state clearly whether we have to sort by key to remove the 1,000-result limitation.)
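For illustration only, here is a rough sketch of that two-request idea, with hypothetical category and created properties on the MyModel kind used above; note that the first, keys-only request was itself subject to the same 1,000-result cap at the time:

from google.appengine.ext import db

# First request: keys only, with the desired filter and sort order.
keys = MyModel.all(keys_only=True).filter('category =', 'news').order('created').fetch(1000)

# Second request(s): re-fetch the full entities in batches;
# db.get() preserves the order of the key list.
entities = []
for start in range(0, len(keys), 200):
    entities.extend(db.get(keys[start:start + 200]))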