Question
I am reading a BigQuery table in Dataflow where one of the fields is a repeated record (type "record", mode "repeated"), so I expected the resulting data type in Java to be List<TableRow>.
However, when I try to iterate over the list I get the following exception:
java.lang.ClassCastException: java.util.LinkedHashMap cannot be cast to com.google.api.services.bigquery.model.TableRow
The table schema looks something like this:
{
"id": "my_id",
"values": [
{
"nested_record": "nested"
}
]
}
The code to iterate over values looks something like this:
String id = (String) row.get("id");
List<TableRow> values = (List<TableRow>) row.get("values");
for (TableRow nested : values) {
    // more logic
}
The exception is thrown right where the loop begins.
The obvious fix here is to just cast values as a List of LinkedHashMaps, but that doesn't feel right.
Why does Dataflow throw this kind of error for nested "records"?
Answer 1:
I hit the same ClassCastException when using Google Cloud Dataflow to read nested tables from BigQuery. I eventually solved it by casting the nested rows to a different type depending on which Dataflow runner I use:
- With DirectRunner: cast to LinkedHashMap
- With DataflowRunner: cast to TableRow
Example:
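Object valuesList = row.get("values");
// DirectRunner: nested records arrive as LinkedHashMap
for (LinkedHashMap<String, Object> v : (List<LinkedHashMap<String, Object>>) valuesList) {
    String name = (String) v.get("name");
    String age = (String) v.get("age");
}
// DataflowRunner: nested records arrive as TableRow
for (TableRow v : (List<TableRow>) valuesList) {
    String name = (String) v.get("name");
    String age = (String) v.get("age");
}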
Answer 2:
Have a look at BEAM-2767.
The underlying cause is the encoding round trip that the DirectRunner performs between steps, which Dataflow usually does not perform. Records are read as type "TableRow", but they are encoded as a simple JSON map. Because the JSON coder does not know the types of the map's fields, it deserializes each nested record as a plain map, which is where the LinkedHashMap comes from. Accessing the repeated record (or any record) as a Map works on both runners, because TableRow implements the Map interface.
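As for why the exception shows up at the loop rather than at the cast itself: Java generics are erased at runtime, so the cast to a List of some element type is unchecked and always succeeds for any List. The element type is only checked when each element is pulled out in the for-each loop. The following is a minimal sketch of that behavior using only the JDK. FakeTableRow and ErasureDemo are hypothetical stand-ins, not part of Beam or the BigQuery client.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class ErasureDemo {
    // Hypothetical stand-in for TableRow: a Map subclass, nothing more.
    static class FakeTableRow extends LinkedHashMap<String, Object> {}

    // Simulates the result of a JSON round trip: plain maps, not FakeTableRow.
    static Object decodedValues() {
        List<Map<String, Object>> values = new ArrayList<>();
        Map<String, Object> nested = new LinkedHashMap<>();
        nested.put("nested_record", "nested");
        values.add(nested);
        return values;
    }

    // Reports whether the ClassCastException is raised by the cast or by the loop.
    @SuppressWarnings("unchecked")
    static String whereItFails() {
        List<FakeTableRow> values;
        try {
            // Unchecked cast: only "is it a List?" is verified at runtime.
            values = (List<FakeTableRow>) decodedValues();
        } catch (ClassCastException e) {
            return "cast";
        }
        try {
            // The compiler inserts a cast to FakeTableRow for every element here.
            for (FakeTableRow nested : values) {
                nested.size();
            }
        } catch (ClassCastException e) {
            return "loop";
        }
        return "none";
    }

    public static void main(String[] args) {
        System.out.println(whereItFails());
    }
}
```

This matches the observation in the question that the exception is thrown right where the loop begins.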
TableRow is a Map, so you can treat both cases as a Map:
String id = (String) row.get("id");
List<Map<String, Object>> values = (List<Map<String, Object>>) row.get("values");
for (Map<String, Object> nested : values) {
    // more logic
}
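If the same access pattern is needed in several places, it could be wrapped in a small helper that checks each element once and hands back a list of maps. This is only a sketch under assumptions: asRecords, RepeatedRecords and FakeTableRow are hypothetical names introduced for illustration, and the helper uses only the JDK so that it behaves the same whether the elements are TableRow instances or plain LinkedHashMaps.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class RepeatedRecords {
    // Hypothetical stand-in for TableRow, used only to exercise the helper.
    static class FakeTableRow extends LinkedHashMap<String, Object> {}

    // Returns the repeated field as a list of maps, whatever Map subtype
    // the runner produced. A null field yields an empty list; a non-List
    // field fails with ClassCastException at the (List<?>) cast.
    @SuppressWarnings("unchecked")
    static List<Map<String, Object>> asRecords(Object field) {
        List<Map<String, Object>> records = new ArrayList<>();
        if (field == null) {
            return records;
        }
        for (Object element : (List<?>) field) {
            if (!(element instanceof Map)) {
                throw new IllegalArgumentException(
                        "Expected a record but got: " + element);
            }
            records.add((Map<String, Object>) element);
        }
        return records;
    }

    public static void main(String[] args) {
        Map<String, Object> plain = new LinkedHashMap<>();
        plain.put("nested_record", "from a plain map");
        FakeTableRow typed = new FakeTableRow();
        typed.put("nested_record", "from a row subclass");

        List<Object> values = new ArrayList<>();
        values.add(plain);
        values.add(typed);

        // Both element types are read through the Map interface.
        for (Map<String, Object> nested : asRecords(values)) {
            System.out.println(nested.get("nested_record"));
        }
    }
}
```

In pipeline code the call would look like asRecords(row.get("values")), with row being the TableRow from the question.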
Source: https://stackoverflow.com/questions/52084143/classcastexception-when-reading-nested-list-of-records