ClassCastException when reading nested list of records

Posted by 五迷三道 on 2021-01-29 10:42:57

Question


I am reading in a BigQuery table from Dataflow where one of the fields is a "record" and "repeated" field. So I expected the resulting data type in Java to be List<TableRow>.

However when I try to iterate over the list I get the following exception:

java.lang.ClassCastException: java.util.LinkedHashMap cannot be cast to com.google.api.services.bigquery.model.TableRow

A row in the table looks something like this:

{
    "id": "my_id",
    "values": [
        {
            "nested_record": "nested"
        }
    ]
}

The code to iterate over values looks something like this:

String id = (String) row.get("id");
List<TableRow> values = (List<TableRow>) row.get("values");

for (TableRow nested : values) {
    // more  logic
}

The exception is thrown right where the loop begins. The obvious fix here is to just cast values as a List of LinkedHashMaps but that doesn't feel right.

Why does Dataflow throw this kind of error for nested "records"?


Answer 1:


I hit the same ClassCastException when using Google Cloud Dataflow to read nested tables from BigQuery. I finally solved it by casting the nested rows to a different type depending on which Dataflow runner I use:

  • With DirectRunner: cast to LinkedHashMap
  • With DataflowRunner: cast to TableRow

Example:

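Object valuesList = row.get("values");

// DirectRunner: nested records arrive as LinkedHashMap
for (LinkedHashMap<String, Object> v : (List<LinkedHashMap<String, Object>>) valuesList) {
   String name = (String) v.get("name");
   String age = (String) v.get("age");
}

// DataflowRunner: nested records arrive as TableRow
for (TableRow v : (List<TableRow>) valuesList) {
   String name = (String) v.get("name");
   String age = (String) v.get("age");
}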



Answer 2:


Have a look at BEAM-2767.

The underlying cause is the encoding round trip that the DirectRunner performs between steps, which Dataflow usually does not perform. Records are read as type TableRow, but they are encoded as a plain JSON map. Because the JSON coder does not know the types of the map's fields, it deserializes each nested record as a plain map. Accessing the repeated record (or any record) as a Map works on both runners, because TableRow implements the Map interface.
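The effect can be reproduced in plain Java without Beam. The sketch below is only an illustration under stated assumptions: FakeRow is a hypothetical stand-in for TableRow, and decodedRow() imitates what a JSON decoder would hand back after the round trip. It also shows why the exception appears at the loop and not at the cast: generics are erased, so the cast to List<FakeRow> is unchecked, and the real type check happens when the first element is assigned to the loop variable.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class RoundTripDemo {
    // Hypothetical stand-in for TableRow: a Map subclass, not the real class.
    static class FakeRow extends HashMap<String, Object> {}

    // Imitates a row after a JSON round trip: the outer row has its declared
    // type, but nested records come back as plain LinkedHashMap instances.
    static FakeRow decodedRow() {
        Map<String, Object> nested = new LinkedHashMap<>();
        nested.put("nested_record", "nested");
        List<Object> values = new ArrayList<>();
        values.add(nested);
        FakeRow row = new FakeRow();
        row.put("id", "my_id");
        row.put("values", values);
        return row;
    }

    // The unchecked cast succeeds; the loop variable assignment is what fails.
    @SuppressWarnings("unchecked")
    static String iterateAsRows(FakeRow row) {
        List<FakeRow> values = (List<FakeRow>) row.get("values");
        try {
            for (FakeRow nested : values) {
                return "ok: " + nested.get("nested_record");
            }
            return "empty";
        } catch (ClassCastException e) {
            return "ClassCastException";
        }
    }

    // Iterating through the Map interface works for any concrete Map type.
    @SuppressWarnings("unchecked")
    static String iterateAsMaps(FakeRow row) {
        List<? extends Map<String, Object>> values =
            (List<? extends Map<String, Object>>) row.get("values");
        for (Map<String, Object> nested : values) {
            return "ok: " + nested.get("nested_record");
        }
        return "empty";
    }

    public static void main(String[] args) {
        FakeRow row = decodedRow();
        System.out.println(iterateAsRows(row)); // prints "ClassCastException"
        System.out.println(iterateAsMaps(row)); // prints "ok: nested"
    }
}
```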

TableRow is a Map so you can treat both cases as Map:

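    String id = (String) row.get("id");
    // Unchecked cast: each element is a TableRow or a LinkedHashMap, and both are Maps
    List<? extends Map<String, Object>> values = (List<? extends Map<String, Object>>) row.get("values");

    for (Map<String, Object> nested : values) {
        // more logic
    }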
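
If the same field is read in several places, the Map-based access could be wrapped in a small helper that validates the shape once. This is a sketch, not part of Beam or the BigQuery client: NestedRecords and asRecords are hypothetical names, and treating a null field as an empty list is an assumption that may not suit every pipeline.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class NestedRecords {
    // Hypothetical helper: views a repeated record field as a list of Maps,
    // regardless of which concrete Map type the runner produced.
    @SuppressWarnings("unchecked")
    static List<Map<String, Object>> asRecords(Object field) {
        if (field == null) {
            // Assumption: a missing repeated field is treated as empty.
            return Collections.emptyList();
        }
        if (!(field instanceof List)) {
            throw new IllegalArgumentException(
                "Expected a List but got " + field.getClass().getName());
        }
        List<Map<String, Object>> records = new ArrayList<>();
        for (Object element : (List<?>) field) {
            if (!(element instanceof Map)) {
                throw new IllegalArgumentException(
                    "Expected a Map element but got " + String.valueOf(element));
            }
            records.add((Map<String, Object>) element);
        }
        return records;
    }

    public static void main(String[] args) {
        Map<String, Object> nested = new LinkedHashMap<>();
        nested.put("nested_record", "nested");
        List<Object> values = new ArrayList<>();
        values.add(nested);

        for (Map<String, Object> record : asRecords(values)) {
            System.out.println(record.get("nested_record")); // prints "nested"
        }
    }
}
```

In a DoFn this would be called as asRecords(row.get("values")), so the calling code does not depend on the runner.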


Source: https://stackoverflow.com/questions/52084143/classcastexception-when-reading-nested-list-of-records
