I'm testing AWS Athena, and so far it is working very well. But I'd like to know the full list of SerDe properties. I've searched far and wide and couldn't find it. For example, I'm using "ignore.malformed.json" = "true", but I'm pretty sure there are a ton of other options to tune the queries.
I couldn't find any information on, for example, what the "path" property does, so having the full list would be great.
I have looked at the Apache Hive docs but couldn't find this there, nor in the AWS docs or forums.
Thanks!
It seems you are using the OpenX JSON SerDe:
http://docs.aws.amazon.com/athena/latest/ug/json.html
Its source code defines the following configuration properties:
// properties used in configuration
public static final String PROP_IGNORE_MALFORMED_JSON = "ignore.malformed.json";
public static final String PROP_DOTS_IN_KEYS = "dots.in.keys";
public static final String PROP_CASE_INSENSITIVE = "case.insensitive";
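These property names are what goes inside the WITH SERDEPROPERTIES clause of the table definition. As a rough sketch of how that fits together, the Python helper below renders a CREATE EXTERNAL TABLE statement for the OpenX SerDe. The helper name, the table name, the columns, and the S3 location are all made up for illustration; only the SerDe class name and the property keys come from the SerDe itself.

```python
# SerDe class used by Athena for OpenX JSON tables.
SERDE_CLASS = "org.openx.data.jsonserde.JsonSerDe"


def build_create_table(table, columns, location, serde_properties):
    """Render a CREATE EXTERNAL TABLE statement (hypothetical helper).

    columns is a list of (name, type) pairs; serde_properties is a dict
    of property name -> string value, e.g. {"ignore.malformed.json": "true"}.
    """
    cols = ",\n  ".join(f"{name} {dtype}" for name, dtype in columns)
    props = ",\n  ".join(
        f"'{key}' = '{value}'" for key, value in serde_properties.items()
    )
    return (
        f"CREATE EXTERNAL TABLE {table} (\n  {cols}\n)\n"
        f"ROW FORMAT SERDE '{SERDE_CLASS}'\n"
        f"WITH SERDEPROPERTIES (\n  {props}\n)\n"
        f"LOCATION '{location}'"
    )


# Example values below are assumptions, not taken from the question.
ddl = build_create_table(
    table="events",
    columns=[("id", "string"), ("payload", "string")],
    location="s3://example-bucket/events/",
    serde_properties={
        "ignore.malformed.json": "true",
        "case.insensitive": "true",
    },
)
print(ddl)
```

The printed statement can be pasted into the Athena query editor after replacing the placeholder names with real ones.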
As stated in the release notes (see bullet #2), the OpenX JSON SerDe used in Athena has been improved. The improvements include, but are not limited to, the following:
- Support for the ConvertDotsInJsonKeysToUnderscores property. When set to TRUE, it allows the SerDe to replace the dots in key names with underscores. For example, if the JSON dataset contains a key with the name "a.b", you can use this property to define the column name to be "a_b" in Athena. The default is FALSE. By default, Athena does not allow dots in column names.
- Support for the case.insensitive property. By default, Athena requires that all keys in your JSON dataset use lowercase. Using WITH SERDEPROPERTIES ("case.insensitive" = "false") allows you to use case-sensitive key names in your data. The default is TRUE. When set to TRUE, the SerDe converts all uppercase columns to lowercase.
For more information, see OpenX JSON SerDe in the Amazon Athena User Guide.
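To make the effect of the two properties concrete, here is a small Python sketch of the key mapping the release notes describe. It is only an illustration of the described behavior, not the SerDe's actual code; the function and parameter names are hypothetical.

```python
def normalize_key(key, case_insensitive=True, dots_to_underscores=False):
    """Mimic how a JSON key would map to a column name (illustrative only).

    case_insensitive=True mirrors the default described above, where
    uppercase characters are converted to lowercase.
    dots_to_underscores=True mirrors ConvertDotsInJsonKeysToUnderscores.
    """
    if dots_to_underscores:
        key = key.replace(".", "_")
    if case_insensitive:
        key = key.lower()
    return key


print(normalize_key("a.b", dots_to_underscores=True))
print(normalize_key("UserId"))
print(normalize_key("UserId", case_insensitive=False))
```

With these inputs, the first call gives the "a_b" mapping from the release notes, the second lowercases the key, and the third leaves it unchanged.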
Source: https://stackoverflow.com/questions/44118660/serde-properties-list-for-aws-athena-json