Querying Postgres 9.3 by json field is really great. However, I couldn't find a formal way to update the json object, for which I use an internal function written in plpythonu.
No eval is required. Your issue is that you're not decoding the value as a json object.
CREATE OR REPLACE FUNCTION json_update(data json, key text, value json)
RETURNS json AS
$BODY$
from json import loads, dumps
if key is None: return data
# plpythonu passes json arguments in as strings, so decode 'data' first
js = loads(data)
# ... and you must decode 'value' with loads too, or it stays a string:
js[key] = loads(value)
return dumps(js)
$BODY$
LANGUAGE plpythonu VOLATILE;
postgres=# SELECT json_update('{"a":1}', 'a', '{"innerkey":"innervalue"}');
json_update
-----------------------------------
{"a": {"innerkey": "innervalue"}}
(1 row)
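In practice you'd usually call this from an UPDATE; a minimal sketch, assuming a hypothetical table my_table with a json column blob and an id column (all placeholder names):

UPDATE my_table
SET blob = json_update(blob, 'a', '{"innerkey":"innervalue"}')
WHERE id = 42;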
Not only that, but using eval to decode json is dangerous and unreliable. It's unreliable because json isn't Python; it just happens to evaluate a little bit like it much of the time. It's unsafe because you never know what you might be eval'ing. In this case you are largely protected by PostgreSQL's json parser:
postgres=# SELECT json_update(
postgres(# '{"a":1}',
postgres(# 'a',
postgres(# '__import__(''shutil'').rmtree(''/glad_this_is_not_just_root'')'
postgres(# );
ERROR: invalid input syntax for type json
LINE 4: '__import__(''shutil'').rmtree(''/glad_this_is_not_...
^
DETAIL: Token "__import__" is invalid.
CONTEXT: JSON data, line 1: __import__...
... but I won't be at all surprised if someone can slip an eval exploit past that. So the lesson here: don't use eval.
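A quick way to see the "unreliable" half: the JSON literals null, true and false aren't Python names, so perfectly valid JSON that loads accepts would blow up under eval. A sketch using the json_update above:

-- works with the loads-based function:
SELECT json_update('{"a":1}', 'a', '{"flag": true, "missing": null}');
-- an eval-based body would fail on the same, perfectly valid, input with
-- NameError: name 'true' is not defined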
Solved
The problem with the above plpythonu function is that it treats "value" as a string no matter whether it's actually a complex json object. The key to solving it is to wrap value in eval():
js[key] = eval(value)
That way the json string (named 'value' in this example) loses its outer enclosing double quotes and becomes an object.
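For illustration, here's what the two assignments produce for the example call above (results sketched by hand, not copied from a session):

-- js[key] = value        ->  {"a": "{\"innerkey\": \"innervalue\"}"}   (nested string)
-- js[key] = eval(value)  ->  {"a": {"innerkey": "innervalue"}}         (nested object)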
Here's an option for people who want plv8 (a trusted language, usable on services like Heroku). I often need to do migrations or updates to json blobs, and running a query directly on the db is much faster than downloading all the data, transforming it, and then posting an update.
CREATE EXTENSION plv8;
CREATE OR REPLACE FUNCTION json_replace_string(obj json, path text, value text, force boolean)
RETURNS json AS $$
  // a null value is usually an accident; only write it when force is set
  if (value === null && !force) {
    return obj;
  }
  // drop ']' and split on '.' or '[' (the capture group keeps the separators,
  // so keys land at even indices and separators at odd ones)
  var nestedRe = /(\.|\[)/;
  var scrub = /]/g;
  path = path.replace(scrub, '');
  var pathBits = path.split(nestedRe);
  var len = pathBits.length;
  var layer = obj;
  for (var i = 0; i < len; i += 2) {
    if (layer === null || layer === undefined) return obj;
    var key = pathBits[i];
    if (key === '') continue;
    if (i === len - 1) {
      // last path segment: write the value
      layer[key] = value;
    } else {
      // with force, create missing intermediate containers: an object if
      // the next separator is '.', an array if it is '['
      if (force && typeof layer[key] === 'undefined') {
        layer[key] = pathBits[i+1] === '.' ? {} : [];
      }
      layer = layer[key];
    }
  }
  return obj;
$$ LANGUAGE plv8 IMMUTABLE;
You can use this like so:
UPDATE my_table
SET blob=json_replace_string(blob, 'some.nested.path[5].to.object', 'new value', false)
WHERE some_condition;
The force parameter serves two functions. (1) It lets you set a null value: if you are dynamically generating the value based on other columns that don't exist, e.g. blob->'non_existent_value', then null will be passed into the function, and you probably don't mean to set the value to null. (2) It forces the creation of the nested path if it doesn't already exist in the json object you are mutating. E.g.

json_replace_string('{"some_key": "some_val"}', 'other_key', 'new_val', true)

gives

{"some_key": "some_val", "other_key": "new_val"}
You can imagine similar functions to update numeric values, delete keys, etc. This basically enables Mongo-like functionality inside Postgres during the early stages of new features, for quick prototyping; as our schema stabilizes, we break things out into independent columns and tables to get the best performance.
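For instance, a minimal delete-key companion in the same style (a hypothetical sketch; it only handles top-level keys):

CREATE OR REPLACE FUNCTION json_delete_key(obj json, key text)
RETURNS json AS $$
  // plv8 hands the json argument over as a real JS object,
  // so a top-level key can simply be deleted
  delete obj[key];
  return obj;
$$ LANGUAGE plv8 IMMUTABLE;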