How to fix "Expected start-union. Got VALUE_NUMBER_INT" when converting JSON to Avro on the command line?

Asked 2020-11-27 17:01 · 4 answers · 2053 views

I'm trying to validate a JSON file using an Avro schema and write the corresponding Avro file. First, I've defined the following Avro schema named user.avsc:

4 Answers
  • 2020-11-27 17:39

    I have implemented a union and its validation: create a union schema and send values to it through Postman. The registry URL is the one you configure in your Kafka properties; you can also pass dynamic values to your schema.

        import org.apache.avro.Schema;
        import org.apache.avro.generic.GenericData;
        import org.apache.avro.generic.GenericDatumReader;
        import org.apache.avro.generic.GenericRecord;
        import org.json.JSONObject;
        import org.springframework.http.*;
        import org.springframework.web.client.RestTemplate;

        // Fetch the registered schema for the topic from the Schema Registry.
        RestTemplate template = new RestTemplate();
        HttpHeaders headers = new HttpHeaders();
        headers.setContentType(MediaType.APPLICATION_JSON);
        HttpEntity<String> entity = new HttpEntity<>(headers);
        ResponseEntity<String> response = template.exchange(
                registryUrl + "/subjects/" + topic + "/versions/" + version,
                HttpMethod.GET, entity, String.class);

        // The registry response is of the form {"schema": "<escaped schema JSON>", ...}.
        JSONObject jsonObject = new JSONObject(response.getBody());
        Schema schema = new Schema.Parser().parse(jsonObject.get("schema").toString());

        // jsonResult is the incoming JSON payload to validate (e.g. the Postman body).
        JSONObject jsonObjectResult = new JSONObject(jsonResult);
        GenericRecord genericRecord = new GenericData.Record(schema);
        schema.getFields().forEach(field ->
                genericRecord.put(field.name(), jsonObjectResult.get(field.name())));

        // Validate the populated record against the schema.
        GenericDatumReader<GenericRecord> reader = new GenericDatumReader<>(schema);
        boolean valid = reader.getData().validate(schema, genericRecord);
    
  • 2020-11-27 17:43

    According to the explanation by Doug Cutting,

    Avro's JSON encoding requires that non-null union values be tagged with their intended type. This is because unions like ["bytes","string"] and ["int","long"] are ambiguous in JSON: the first pair are both encoded as JSON strings, while the second are both encoded as JSON numbers.

    http://avro.apache.org/docs/current/spec.html#json_encoding

    Thus your record must be encoded as:

    {"name": "Alyssa", "favorite_number": {"int": 7}, "favorite_color": null}
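    With the union tagged, the conversion from the original question can be done with the avro-tools jar via its fromjson subcommand. A minimal sketch (the jar filename/version and a user.avsc matching the record above are assumptions; download avro-tools from the Apache Avro releases page or Maven Central):

    ```shell
    # Write the record with the union value tagged with its type.
    cat > user.json <<'EOF'
    {"name": "Alyssa", "favorite_number": {"int": 7}, "favorite_color": null}
    EOF

    # Validate against the schema and emit the binary Avro file.
    # avro-tools-1.11.3.jar is an assumed local path.
    java -jar avro-tools-1.11.3.jar fromjson --schema-file user.avsc user.json > user.avro
    ```

    If the union value were left untagged ("favorite_number": 7), this is exactly the invocation that fails with "Expected start-union. Got VALUE_NUMBER_INT".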
    
  • 2020-11-27 17:43

    There is a new JSON encoder in the works that should address this common issue:

    https://issues.apache.org/jira/browse/AVRO-1582

    https://github.com/zolyfarkas/avro

  • 2020-11-27 17:47

    As @Emre-Sevinc has pointed out, the issue is with the encoding of your Avro record.

    To be more specific:

    Don't do this:

       jsonRecord = avroGenericRecord.toString
    

    Instead, do this:

        // Serialize the record through Avro's JSON encoder so that
        // non-null union values come out tagged with their type.
        val writer = new GenericDatumWriter[GenericRecord](avroSchema)
        val baos = new ByteArrayOutputStream
        val jsonEncoder = EncoderFactory.get.jsonEncoder(avroSchema, baos)
        writer.write(avroGenericRecord, jsonEncoder)
        jsonEncoder.flush()

        val jsonRecord = baos.toString("UTF-8")
    

    You'll also need the following imports:

    import java.io.ByteArrayOutputStream
    import org.apache.avro.Schema
    import org.apache.avro.generic.{GenericData, GenericDatumReader, GenericDatumWriter, GenericRecord}
    import org.apache.avro.io.{DecoderFactory, EncoderFactory}
    

    After you do this, you'll get a jsonRecord whose non-null union values are tagged with their intended type.
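    To sanity-check the tagged form from the shell, avro-tools can also dump an existing Avro file back to the same JSON encoding via its tojson subcommand (the jar path/version is an assumption, as above):

    ```shell
    # Prints each record as JSON with union values tagged, e.g.
    # {"name":"Alyssa","favorite_number":{"int":7},"favorite_color":null}
    java -jar avro-tools-1.11.3.jar tojson user.avro
    ```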

    Hope this helps!
