Transferring data from production datastore to local development environment datastore in Google App Engine (Python)

南方客 2021-01-15 09:13

TL;DR I need a real solution for downloading my data from the production datastore and loading it into the local development environment.

The detailed problem:

I nee

1 answer
  •  -上瘾入骨i
    2021-01-15 09:30

    It sounds like you should be using a remote sandbox

    Even if you get this to work, the localhost datastore still behaves differently than the actual datastore.

    If you want to truly simulate your production environment, I would recommend setting up a clone of your App Engine project as a remote sandbox. Deploy your app to a new GAE project ID with appcfg.py update . -A sandbox-id, use Datastore Admin to create a backup of production in Google Cloud Storage, and then use Datastore Admin in the sandbox to restore that backup.

    Cloning production data into localhost

    I do prime my localhost datastore with some production data, but it is not a complete clone: just the core required objects and a few test users.

    To do this I wrote a Google Dataflow job that exports selected models and saves them in Google Cloud Storage in JSONL format. Then on my localhost I have an endpoint called /init/ which launches a taskqueue job to download these exports and import them.
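The JSONL round trip itself can be sketched with the standard library alone. This is only an illustration, not the Dataflow job or the /init/ task described above: `dump_jsonl`, `load_jsonl`, and the sample records are hypothetical names and data, and an in-memory buffer stands in for the file downloaded from Cloud Storage.

```python
import io
import json


def dump_jsonl(dtos, fp):
    """Write one JSON object per line (JSONL) to a text file object."""
    for dto in dtos:
        fp.write(json.dumps(dto, sort_keys=True) + "\n")


def load_jsonl(fp):
    """Yield one dict per non-empty line of a JSONL text file object."""
    for line in fp:
        line = line.strip()
        if line:
            yield json.loads(line)


# Hypothetical sample data; a real export would hold one DTO per entity.
exported = [
    {"key": "user-1", "name": "alice"},
    {"key": "user-2", "name": "bob"},
]

# An in-memory buffer stands in for the exported file.
buf = io.StringIO()
dump_jsonl(exported, buf)
buf.seek(0)
restored = list(load_jsonl(buf))
```

Because each line is an independent JSON document, the import side can stream the file and write entities in batches instead of loading the whole export into memory.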

    To do this I reuse my JSON REST handler code, which can convert any model to JSON and back.

    In theory you could do this for your entire datastore.

    EDIT - This is what my to-json/from-json code looks like:

    All of my ndb.Models subclass my BaseModel which has generic conversion code:

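    # Property class -> converter applied when serializing a value into a DTO.
    # The converter helpers (dt_to_timestamp, key_to_dto, ...) are not shown here.
    get_dto_typemap = {
        ndb.DateTimeProperty: dt_to_timestamp,
        ndb.KeyProperty: key_to_dto,
        ndb.StringProperty: str_to_dto,
        ndb.EnumProperty: str,
    }
    # Property class -> converter applied when loading a DTO value into a model.
    set_from_dto_typemap = {
        ndb.DateTimeProperty: timestamp_to_dt,
        ndb.KeyProperty: dto_to_key,
        ndb.FloatProperty: float_from_dto,
        ndb.StringProperty: strip,
        ndb.BlobProperty: str,
        ndb.IntegerProperty: int,
    }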
    
    class BaseModel(ndb.Model):
    
        def to_dto(self):
            dto = {'key': key_to_dto(self.key)}
            for prop in self._properties.values():
                key = prop._name
                value = getattr(self, key)
                convert = get_dto_typemap.get(prop.__class__)
                if convert:
                    # Repeated properties hold lists, so convert each item.
                    if prop._repeated:
                        value = [convert(v) for v in value]
                    else:
                        value = convert(value)
                dto[key] = value
            return dto
    
        def set_from_dto(self, dto):
            for prop in self._properties.values():
                # Computed properties are derived values and cannot be assigned.
                if isinstance(prop, ndb.ComputedProperty):
                    continue
                key = prop._name
                if key not in dto:
                    continue
                value = dto[key]
                convert = set_from_dto_typemap.get(prop.__class__)
                try:
                    # Repeated properties are assigned as-is, without per-item conversion.
                    if convert and not prop._repeated:
                        value = convert(value)
                    setattr(self, key, value)
                except Exception as e:
                    raise Exception("Error setting %s.%s to '%s': %s" % (
                        self.__class__.__name__, key, value, e))
    
    class User(BaseModel):
        # user fields, etc
        pass
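The converter helpers referenced in the two type maps are not shown in the answer. The following is a minimal, standard-library-only sketch of what the datetime and string helpers might look like, and of how a lookup table drives the round trip. The helper bodies, the `TO_DTO`/`FROM_DTO` tables, and the sample record are all assumptions. The sketch keys the tables by field name rather than by ndb property class, purely so that it runs without App Engine.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)


def dt_to_timestamp(dt):
    """Naive UTC datetime -> seconds since the Unix epoch (None passes through)."""
    return None if dt is None else (dt - EPOCH).total_seconds()


def timestamp_to_dt(ts):
    """Seconds since the Unix epoch -> naive UTC datetime (None passes through)."""
    return None if ts is None else EPOCH + timedelta(seconds=ts)


def strip(value):
    """Trim surrounding whitespace from a string (None passes through)."""
    return None if value is None else value.strip()


def identity(value):
    """Fallback converter for fields that need no conversion."""
    return value


# Hypothetical stand-ins for the property-class type maps: field name -> converter.
TO_DTO = {"created": dt_to_timestamp}
FROM_DTO = {"created": timestamp_to_dt, "name": strip}


def to_dto(record):
    """Serialize a record into JSON-friendly values."""
    return {k: TO_DTO.get(k, identity)(v) for k, v in record.items()}


def from_dto(dto):
    """Rebuild a record from its JSON-friendly form."""
    return {k: FROM_DTO.get(k, identity)(v) for k, v in dto.items()}


record = {"name": "alice", "created": datetime(2021, 1, 15, 9, 30)}
dto = to_dto(record)
round_tripped = from_dto(dto)
```

Keeping the conversion rules in tables means a new field type needs one new entry per direction rather than changes to the serialization loops.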
    

    My request handlers then use set_from_dto and to_dto like this (BaseHandler also provides some convenience methods, such as converting JSON payloads to Python dicts):

    class RestHandler(BaseHandler):
        MODEL = None
    
        def put(self, resource_id=None):
            if resource_id:
                # A urlsafe key already encodes the kind, so it is passed on its own.
                obj = ndb.Key(urlsafe=resource_id).get()
                if obj:
                    obj.set_from_dto(self.json_body)
                    obj.put()
                    return obj.to_dto()
                else:
                    self.abort(422, "Unknown id")
            else:
                self.abort(405)
    
        def post(self, resource_id=None):
            if resource_id:
                self.abort(405)
            else:
                obj = self.MODEL()
                obj.set_from_dto(self.json_body)
                obj.put()
                return obj.to_dto()
    
        def get(self, resource_id=None):
            if resource_id:
                # A urlsafe key already encodes the kind, so it is passed on its own.
                obj = ndb.Key(urlsafe=resource_id).get()
                if obj:
                    return obj.to_dto()
                else:
                    self.abort(422, "Unknown id")
            else:
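                cursor_key = self.request.GET.pop('$cursor', None)
                # Rebuild the query cursor from its urlsafe form; None starts at the first page.
                cursor = ndb.Cursor(urlsafe=cursor_key) if cursor_key else None
                # Query-string values are strings, so convert before clamping to [10, 200].
                limit = max(min(200, int(self.request.GET.pop('$limit', 200))), 10)
                qs = self.MODEL.query()
                # ... other code that handles query params
                results, next_cursor, more = qs.fetch_page(limit, start_cursor=cursor)
                return {
                    '$cursor': next_cursor.urlsafe() if more and next_cursor else None,
                    'results': [result.to_dto() for result in results],
                }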
    
    class UserHandler(RestHandler):
        MODEL = User
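The paging contract of the list endpoint (a clamped `$limit` plus a `$cursor` that is followed until no more results remain) can be illustrated without App Engine. In this sketch `clamp_limit`, `fetch_page`, and `fetch_all` are hypothetical names, and a list index stands in for the opaque datastore cursor. It shows only the control flow a client or import task might use, not the ndb API.

```python
def clamp_limit(raw, default=200, lowest=10, highest=200):
    """Parse a '$limit' query value and clamp it into [lowest, highest]."""
    try:
        value = int(raw)
    except (TypeError, ValueError):
        value = default
    return max(min(highest, value), lowest)


def fetch_page(items, limit, start=0):
    """In-memory stand-in for a paged query: returns (results, next_start, more)."""
    results = items[start:start + limit]
    next_start = start + len(results)
    return results, next_start, next_start < len(items)


def fetch_all(items, limit):
    """Follow the cursor page by page until no more results are reported."""
    collected, cursor, more = [], 0, True
    while more:
        results, cursor, more = fetch_page(items, limit, cursor)
        collected.extend(results)
    return collected


rows = list(range(25))
all_rows = fetch_all(rows, clamp_limit("10"))
```

Clamping on the server side keeps a client from requesting unbounded pages, and treating the cursor as opaque lets the server change its paging mechanism without breaking callers.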
    
