I am trying to get some values out of nested JSON for millions of rows (5 TB+ table). What is the most efficient way to do this?
Here is an example:
{\"c
For a table of this size, a JSON SerDe is the better approach: Hive deserializes each row into columns as it reads the table, so nested fields can be queried directly.
A tutorial on how to implement a SerDe for parsing JSON can be found here:
http://blog.cloudera.com/blog/2012/12/how-to-use-a-serde-in-apache-hive/
You can also use the following ready-made SerDe implementation instead of writing your own:
https://github.com/rcongiu/Hive-JSON-Serde
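
To make the idea concrete, here is a rough Python sketch of what the deserialize step of a JSON SerDe does conceptually: parse each row once, then pull out every nested value you need from the parsed record. This is not Hive's SerDe API (a real SerDe is written in Java), and the names `get_path`, `deserialize_rows`, and the sample field names are hypothetical, chosen only for illustration.

```python
import json
from typing import Any, Iterable, Iterator, Sequence


def get_path(record: Any, path: str, default: Any = None) -> Any:
    """Walk a dotted path such as 'user.address.city' through nested dicts."""
    current = record
    for key in path.split("."):
        # Stop early if the path leaves the nested-object structure.
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current


def deserialize_rows(lines: Iterable[str], paths: Sequence[str]) -> Iterator[tuple]:
    """Parse each JSON line once and yield one tuple of the requested fields."""
    for line in lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            # Assumption: a malformed row yields NULL-like values instead of failing the job.
            yield tuple(None for _ in paths)
            continue
        yield tuple(get_path(record, p) for p in paths)


if __name__ == "__main__":
    # Hypothetical sample rows; the real schema is whatever your JSON contains.
    sample = [
        '{"user": {"id": 7, "address": {"city": "Oslo"}}}',
        "not json",
    ]
    for row in deserialize_rows(sample, ["user.id", "user.address.city"]):
        print(row)
```

The point of the design is that the JSON text is parsed once per row no matter how many nested fields are extracted, which is the same property that makes a SerDe attractive on a multi-terabyte table.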