Accessing nested fields in AVRO GenericRecord (Java/Scala)

Posted by 社会主义新天地 on 2021-02-07 09:17:58

Question


I have a GenericRecord with nested fields. When I call genericRecord.get(1), it returns an Object that contains the nested Avro data.

I want to access that object with something like genericRecord.get(1).get(0), but I can't, because Avro returns a plain Object.

Is there an easy way around this?

When I try something like returnedObject.get("item"), the compiler says that item is not a member of returnedObject.


Answer 1:


I figured out one way to do it: cast the returned Object to a GenericRecord.

Example (Scala):

val data_nestedObj = data.get("nestedObj").asInstanceOf[GenericRecord]

Then I can access a nested field within that new GenericRecord by doing:

data_nestedObj.get("nestedField")

This works well enough for me.
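
For anyone who wants to try this end to end, here is a minimal sketch. The schema, the record names (Outer, Inner) and the NestedAccess object are made up for illustration; only the field names nestedObj and nestedField come from the example above. It also shows a pattern match as an alternative to the bare cast, which returns None instead of throwing a ClassCastException when the field is not a record.

```scala
import org.apache.avro.{Schema, SchemaBuilder}
import org.apache.avro.generic.{GenericData, GenericRecord}

object NestedAccess {
  // Hypothetical schemas: an outer record holding one nested record.
  val innerSchema: Schema = SchemaBuilder.record("Inner").fields()
    .requiredString("nestedField")
    .endRecord()

  val outerSchema: Schema = SchemaBuilder.record("Outer").fields()
    .name("nestedObj").`type`(innerSchema).noDefault()
    .endRecord()

  // Build a sample record in memory so the example is self-contained.
  def sampleRecord(): GenericRecord = {
    val inner = new GenericData.Record(innerSchema)
    inner.put("nestedField", "hello")
    val outer = new GenericData.Record(outerSchema)
    outer.put("nestedObj", inner)
    outer
  }

  // The cast from the answer: throws ClassCastException if nestedObj is not a record.
  def nestedFieldByCast(data: GenericRecord): AnyRef =
    data.get("nestedObj").asInstanceOf[GenericRecord].get("nestedField")

  // Pattern-match variant: None if nestedObj is null or not a record.
  def nestedField(data: GenericRecord): Option[AnyRef] =
    data.get("nestedObj") match {
      case nested: GenericRecord => Option(nested.get("nestedField"))
      case _                     => None
    }
}
```

Note that string values read back from an Avro file usually come out as org.apache.avro.util.Utf8 rather than java.lang.String, so call toString on the result if you need a String.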




Answer 2:


You could use an Avro serialization library to help you. For example, avro4s (https://github.com/sksamuel/avro4s; I am the author), though there are others.

You just need to define a case class for the type of data you are getting, and this can include nested case classes. For example,

case class Boo(d: Boolean)
case class Foo(a: String, b: Int, c: Boo)

Then you create an instance of the RecordFormat typeclass.

val format = RecordFormat[Foo]

Then finally, you can use that to extract records or create records.

val record = format.to(someFoo)

or

val foo = format.from(someRecord)



Answer 3:


@rye's answer is correct and works fine, but if you can avoid the use of asInstanceOf then you should. So I wrote the following method to retrieve nested fields.

  import org.apache.avro.generic.GenericRecord

  /**
    * Get the value of the provided property. If the property contains `.`, it is treated as a nested path:
    * each token descends one level into the avroRecord, and the value found at the last level is returned.
    * Only CharSequence and Number leaf values are returned; anything else yields None.
    */
  def getNestedProperty(property: String, avroRecord: GenericRecord): Option[Object] = {
    val tokens = property.split("\\.")

    // The accumulator is (record to search next, value found so far).
    tokens.foldLeft[(GenericRecord, Option[Object])]((avroRecord, None)) { case ((record, found), token) =>
      record.get(token) match {
        case nested: GenericRecord =>
          // Descend one level and keep looking.
          (nested, found)
        case value @ (_: CharSequence | _: Number) =>
          // Leaf value reached.
          (record, Option(value))
        case _ =>
          // Null value or an unsupported type.
          (record, None)
      }
    }._2
  }
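
If you want the lookup to stop as soon as a path segment is missing, a recursive variant is one option. The sketch below is not the method above but a different take on the same idea: the names NestedLookup, lookup and sample, and the sample schema, are hypothetical. It checks the schema before reading each field and returns whatever value sits at the end of the path, including a nested record.

```scala
import org.apache.avro.SchemaBuilder
import org.apache.avro.generic.{GenericData, GenericRecord}
import scala.annotation.tailrec

object NestedLookup {
  // Follow a dot-separated path. Returns None if a segment is not in the
  // schema, a value is null, or a non-record is hit before the path ends.
  def lookup(record: GenericRecord, path: String): Option[AnyRef] = {
    @tailrec
    def go(current: GenericRecord, tokens: List[String]): Option[AnyRef] = tokens match {
      case Nil => None
      case token :: rest =>
        // Check the schema first so an unknown field name never reaches get().
        val field = current.getSchema.getField(token)
        if (field == null) None
        else (current.get(field.pos()), rest) match {
          case (null, _)                => None
          case (value, Nil)             => Some(value)    // end of path
          case (next: GenericRecord, _) => go(next, rest) // descend one level
          case _                        => None           // path continues past a leaf
        }
    }
    go(record, path.split("\\.").toList)
  }

  // Hypothetical sample data, built in memory for illustration only.
  val sample: GenericRecord = {
    val innerSchema = SchemaBuilder.record("Inner").fields()
      .requiredString("nestedField")
      .endRecord()
    val outerSchema = SchemaBuilder.record("Outer").fields()
      .name("nestedObj").`type`(innerSchema).noDefault()
      .endRecord()
    val inner = new GenericData.Record(innerSchema)
    inner.put("nestedField", "hello")
    val outer = new GenericData.Record(outerSchema)
    outer.put("nestedObj", inner)
    outer
  }
}
```

With the sample record, NestedLookup.lookup(NestedLookup.sample, "nestedObj.nestedField") finds the string, while a path such as "nestedObj.missing" gives None.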


Source: https://stackoverflow.com/questions/35729253/accessing-nested-fields-in-avro-genericrecord-java-scala
