How to delete all datastore in Google App Engine?

夕颜 2020-11-28 01:17

Does anyone know how to delete all data in the datastore in Google App Engine?

29 answers
  • 2020-11-28 01:56

    The fastest and most efficient way to handle bulk deletes on Datastore is to use the new mapper API announced at the latest Google I/O.

    If your language of choice is Python, you just have to register your mapper in a mapreduce.yaml file and define a function like this:

    from mapreduce import operation as op

    def process(entity):
        # Yield a delete operation for every entity the mapper visits.
        yield op.db.Delete(entity)
    

    In Java, have a look at this article, which suggests a function like this:

    @Override
    public void map(Key key, Entity value, Context context) {
        log.info("Adding key to deletion pool: " + key);
        DatastoreMutationPool mutationPool = this.getAppEngineContext(context)
                .getMutationPool();
        mutationPool.delete(value.getKey());
    }
    

    EDIT:
    Since SDK 1.3.8, there's a Datastore Admin feature for this purpose.

  • 2020-11-28 01:57

    If you're using ndb, this is what worked for me to clear the datastore:

    ndb.delete_multi(ndb.Query(default_options=ndb.QueryOptions(keys_only=True)))
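
    On a large datastore, a single call over every key may take longer than one request allows. A minimal sketch of handing keys to a delete function in fixed-size batches follows; `delete_in_batches` and `batch_size` are hypothetical names, not part of ndb, and the delete function is passed in so the helper itself does not depend on App Engine.

```python
from itertools import islice


def delete_in_batches(keys, delete_fn, batch_size=500):
    """Call delete_fn on successive lists of at most batch_size keys.

    Hypothetical helper, not part of ndb. Returns the number of keys
    handed to delete_fn.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(keys)
    total = 0
    while True:
        # Pull the next batch lazily so the full key list is never held at once.
        batch = list(islice(iterator, batch_size))
        if not batch:
            return total
        delete_fn(batch)
        total += len(batch)
```

    With ndb this might be called as `delete_in_batches(MyModel.query().iter(keys_only=True), ndb.delete_multi)`, where `MyModel` stands in for one of your own model classes.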
    
  • 2020-11-28 01:57
    • Continuing svpino's idea, it is wise to reuse records marked as deleted (his idea was not to remove unused records but to mark them as "deleted"). A little cache/memcache to hold the working copy, writing only the difference between states (before and after the task) to the datastore, makes this better. For big tasks, intermediate difference chunks can be written to the datastore to avoid data loss if memcache disappears. To make it loss-proof, check the integrity/existence of the memcached results and restart the task (or the required part) to repeat the missing computations. Once the data difference has been written to the datastore, the corresponding computations are discarded from the queue.

    • Another idea, similar to MapReduce, is to shard an entity kind into several different entity kinds that are collected together and appear as a single entity kind to the end user. Entries are only marked as "deleted". When the number of "deleted" entries in a shard exceeds some limit, the "alive" entries are distributed among the other shards, and this shard is closed forever and then deleted manually from the dev console (presumably at lower cost). Update: there seems to be no drop-table in the console, only record-by-record deletion at the regular price.

    • It is possible to delete a large set of records by query, in chunks, without GAE failing (at least it works locally), and to continue in the next attempt when time runs out:

    
        qdelete.getFetchPlan().setFetchSize(100);
    
        while (true)
        {
            long result = qdelete.deletePersistentAll(candidates);
            LOG.log(Level.INFO, String.format("deleted: %d", result));
            if (result <= 0)
                break;
        }
    
    • Sometimes it is also useful to add an extra field to the primary table instead of putting candidates (related records) into a separate table. And yes, the field may be an unindexed/serialized array with little computation cost.
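
    The chunked delete loop in the third point, with its "continue in the next attempt" behaviour, can be sketched independently of JDO. This is only an illustration under assumptions: `delete_until_deadline`, `delete_chunk` and `budget_seconds` are hypothetical names, and an in-memory list stands in for the datastore.

```python
import time


def delete_until_deadline(delete_chunk, budget_seconds, clock=time.monotonic):
    """Call delete_chunk() until it deletes nothing or the time budget is spent.

    delete_chunk is a hypothetical callable that removes up to one chunk of
    records and returns how many it removed.
    Returns (total_deleted, finished).
    """
    deadline = clock() + budget_seconds
    total = 0
    while clock() < deadline:
        deleted = delete_chunk()
        if deleted <= 0:
            return total, True   # nothing left to delete
        total += deleted
    return total, False          # out of time; call again to continue


# Usage with an in-memory stand-in for the datastore.
records = list(range(250))


def delete_chunk(size=100):
    removed = len(records[:size])
    del records[:size]
    return removed


total, finished = delete_until_deadline(delete_chunk, budget_seconds=5.0)
print(total, finished)
```

    When `finished` comes back false, the caller can schedule another attempt; each attempt picks up wherever the previous one stopped, because deleted records no longer match the query.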
  • 2020-11-28 02:00

    I was so frustrated with the existing solutions for deleting all data in the live datastore that I created a small GAE app that can delete a fair amount of data within its 30 seconds.

    How to install etc: https://github.com/xamde/xydra

  • 2020-11-28 02:00

    You have two simple options:

    #1: To save cost, delete the entire project.

    #2: Use ts-datastore-orm:

    https://www.npmjs.com/package/ts-datastore-orm

        await Entity.truncate();

    truncate() can delete around 1K rows per second.
