I have a set of keywords that are passed through via JSON from a DB (encoded UTF-8), some of which may have special characters like é, è, ç, etc. This is used as part of an auto
json_encode seems to be dropping strings that contain invalid characters. It is likely that your UTF-8 data is not arriving in the proper form from your database.
Looking at the examples you give, my wild guess would be that your database connection is not UTF-8 encoded and serves ISO-8859-1 characters instead.
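To check that guess, here is a minimal sketch of what json_encode does with the same word in both encodings. The sample word "café" and the variable names are made up for illustration, and the behaviour described assumes PHP 5.5 or later with the mbstring extension loaded.

```php
<?php
// "café" as ISO-8859-1 bytes: é is the single byte 0xE9, which is not valid UTF-8.
$latin1 = "caf\xE9";
// The same word as UTF-8: é is the two bytes 0xC3 0xA9.
$utf8 = "caf\xC3\xA9";

// Validate the raw bytes before encoding.
$latin1IsUtf8 = mb_check_encoding($latin1, 'UTF-8'); // false
$utf8IsUtf8   = mb_check_encoding($utf8, 'UTF-8');   // true

// json_encode only accepts UTF-8 strings; on PHP >= 5.5 it returns false here.
$bad = json_encode(array($latin1));
if ($bad === false) {
    echo 'json_encode failed: ', json_last_error_msg(), "\n";
}

// Valid UTF-8 encodes fine (non-ASCII characters are escaped by default).
$good = json_encode(array($utf8));
echo $good, "\n"; // ["caf\u00e9"]

// If converting from ISO-8859-1 produces the expected UTF-8 bytes,
// the data was most likely served as ISO-8859-1.
$converted = mb_convert_encoding($latin1, 'UTF-8', 'ISO-8859-1');
```

If the strings coming from your database behave like $latin1 here, the connection charset is the likely culprit rather than json_encode itself.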
Can you try a SET NAMES utf8; after initializing the connection?
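As a rough sketch of where that would go, assuming you are on MySQL and using the mysqli extension (neither is stated in the question). The function name connectUtf8 and its parameters are hypothetical placeholders.

```php
<?php
// Sketch only: connectUtf8 and its arguments are placeholders, not from the question.
function connectUtf8($host, $user, $password, $database)
{
    $mysqli = new mysqli($host, $user, $password, $database);
    if ($mysqli->connect_errno) {
        throw new RuntimeException('Connect failed: ' . $mysqli->connect_error);
    }

    // Sets the connection charset right after connecting. This has the same
    // effect on the server as running the query "SET NAMES utf8", and it
    // also lets the client library know which charset is in use.
    if (!$mysqli->set_charset('utf8')) {
        throw new RuntimeException('Could not set charset: ' . $mysqli->error);
    }

    return $mysqli;
}
```

You could also send the statement literally with $mysqli->query('SET NAMES utf8'), but the PHP manual recommends set_charset() for mysqli. With PDO, the rough equivalent is adding charset=utf8 to the DSN.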