Updating a json field in Postgres

眼角桃花 · 2021-02-20 05:53

Querying Postgres 9.3 by json field is really great. However, I couldn't find a formal way to update the json object, for which I use an internal function written in plpython.

3 Answers
  • 2021-02-20 06:27

    No eval is required. Your issue is that you're not decoding the value as a json object.

    CREATE OR REPLACE FUNCTION json_update(data json, key text, value json)
    RETURNS json AS
    $BODY$
       from json import loads, dumps
       if key is None: return data
       js = loads(data)
       # you must decode 'value' with loads too:
       js[key] = loads(value)
       return dumps(js)
    $BODY$
    LANGUAGE plpythonu VOLATILE;
    
    postgres=# SELECT json_update('{"a":1}', 'a', '{"innerkey":"innervalue"}');
                json_update            
    -----------------------------------
     {"a": {"innerkey": "innervalue"}}
    (1 row)
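    The behaviour is easy to reproduce outside the database in plain Python. A minimal sketch of the difference between assigning the raw string and decoding it first:

```python
import json

data = '{"a": 1}'
value = '{"innerkey": "innervalue"}'

js = json.loads(data)

# Assigning the raw string re-encodes it as a quoted JSON string:
js["a"] = value
print(json.dumps(js))  # {"a": "{\"innerkey\": \"innervalue\"}"}

# Decoding it first nests it as a real object:
js["a"] = json.loads(value)
print(json.dumps(js))  # {"a": {"innerkey": "innervalue"}}
```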
    

    Not only that, but using eval to decode json is dangerous and unreliable. It's unreliable because json isn't Python, it just happens to evaluate a little bit like it much of the time. It's unsafe because you never know what you might be eval'ing. In this case you are largely protected by PostgreSQL's json parser:

    postgres=# SELECT json_update(
    postgres(#    '{"a":1}', 
    postgres(#    'a', 
    postgres(#    '__import__(''shutil'').rmtree(''/glad_this_is_not_just_root'')'
    postgres(# );
    ERROR:  invalid input syntax for type json
    LINE 4:          '__import__(''shutil'').rmtree(''/glad_this_is_not_...
                     ^
    DETAIL:  Token "__import__" is invalid.
    CONTEXT:  JSON data, line 1: __import__...
    

    ... but I won't be at all surprised if someone can slip an eval exploit past that. So the lesson here: don't use eval.
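    The "json isn't Python" point is easy to demonstrate in plain Python: JSON's `null`/`true`/`false` literals are not Python names, so eval fails even on perfectly ordinary JSON documents that json.loads handles fine.

```python
import json

# A perfectly ordinary JSON document:
doc = '{"flag": true, "missing": null}'

# json.loads decodes it correctly:
print(json.loads(doc))  # {'flag': True, 'missing': None}

# eval chokes, because true/null are not defined Python names:
try:
    eval(doc)
except NameError as exc:
    print("eval failed:", exc)
```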

  • 2021-02-20 06:32

    Solved

    The problem with the above plpythonu function is that it treats "value" as a string even when it's actually a complex json object. The key to solving it is to wrap value in eval():

    js[key] = eval(value)
    

    That way the json string (named 'value' in this example) loses its outer enclosing double quotes and becomes an object.

  • 2021-02-20 06:43

    For people who want plv8 (a trusted language, usable on services like Heroku): I often need to do migrations or updates to json blobs, and running a query directly on the db is much faster than downloading all the data, transforming it, and then posting an update.

    CREATE EXTENSION plv8;
    CREATE OR REPLACE FUNCTION json_replace_string(obj json, path text, value text, force boolean)
    RETURNS json AS $$
    // A null value only overwrites an existing one when force is set.
    if (value === null && !force) {
      return obj;
    }
    // Split "a.b[3].c" into alternating keys and separators:
    // ["a", ".", "b", "[", "3", ".", "c"]
    var nestedRe = /(\.|\[)/;
    var scrub = /]/g;
    path = path.replace(scrub, '');
    var pathBits = path.split(nestedRe);
    var len = pathBits.length;
    var layer = obj;
    for (var i = 0; i < len; i += 2) {
      // Bail out if the path walks off the object.
      if (layer === null || layer === undefined) return obj;
      var key = pathBits[i];
      if (key === '') continue;
      if (i === len - 1) {
        layer[key] = value;
      } else {
        // With force, create missing intermediate containers: a "."
        // separator means an object, a "[" means an array.
        if (force && typeof layer[key] === 'undefined') {
          layer[key] = pathBits[i+1] === '.' ? {} : [];
        }
        layer = layer[key];
      }
    }
    return obj;
    $$ LANGUAGE plv8 IMMUTABLE;
    

    You can use it like so:

    UPDATE my_table
    SET blob=json_replace_string(blob, 'some.nested.path[5].to.object', 'new value', false)
    WHERE some_condition;
    

    The force parameter serves two purposes. (1) It lets you set a null value: if you are dynamically generating the value from other columns and one doesn't exist (e.g. blob->'non_existent_value'), null will be passed into the function, and you probably don't mean to set the value to null. (2) It forces the creation of the nested path if it doesn't already exist in the json object you are mutating. E.g.

    json_replace_string('{"some_key": "some_val"}', 'other_key', 'new_val', true)
    

    gives

    {"some_key": "some_val", "other_key": "new_val"}
    

    You can imagine similar functions to update numbers, delete keys, etc. This basically enables Mongo-like functionality inside Postgres during the early stages of new features, for quick prototyping; as our schema stabilizes, we break things out into independent columns and tables to get the best performance.
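    The path-walking logic can be exercised outside plv8 as plain JavaScript. A standalone sketch of the same function (with made-up test objects), runnable in Node:

```javascript
// Standalone version of the plv8 function above.
function jsonReplaceString(obj, path, value, force) {
  if (value === null && !force) return obj;
  // "a.b[3].c" -> ["a", ".", "b", "[", "3", ".", "c"]
  var pathBits = path.replace(/]/g, '').split(/(\.|\[)/);
  var layer = obj;
  for (var i = 0; i < pathBits.length; i += 2) {
    if (layer === null || layer === undefined) return obj;
    var key = pathBits[i];
    if (key === '') continue;
    if (i === pathBits.length - 1) {
      layer[key] = value;
    } else {
      if (force && typeof layer[key] === 'undefined') {
        layer[key] = pathBits[i + 1] === '.' ? {} : [];
      }
      layer = layer[key];
    }
  }
  return obj;
}

// Top-level key creation with force:
console.log(JSON.stringify(
  jsonReplaceString({some_key: 'some_val'}, 'other_key', 'new_val', true)));
// {"some_key":"some_val","other_key":"new_val"}

// Deep path creation, mixing objects and arrays:
console.log(JSON.stringify(
  jsonReplaceString({}, 'a.b[0].c', 'x', true)));
// {"a":{"b":[{"c":"x"}]}}
```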
