Is JSONDeserializationSchema() deprecated in Flink?

一向 2021-01-16 02:15

I am new to Flink and am doing something very similar to what is described in the question below:

Cannot see message while sinking kafka stream and cannot see print message in flink 1.2

2 Answers
  • 2021-01-16 03:05

    JSONDeserializationSchema was removed in Flink 1.8, after having been deprecated earlier.

    The recommended approach is to write a deserializer that implements DeserializationSchema<T>. Here's an example, which I've copied from the Flink Operations Playground:

    import org.apache.flink.api.common.serialization.DeserializationSchema;
    import org.apache.flink.api.common.typeinfo.TypeInformation;
    
    import org.apache.flink.shaded.jackson2.com.fasterxml.jackson.databind.ObjectMapper;
    
    import java.io.IOException;
    
    /**
     * A Kafka {@link DeserializationSchema} to deserialize {@link ClickEvent}s from JSON.
     *
     */
    public class ClickEventDeserializationSchema implements DeserializationSchema<ClickEvent> {
    
        private static final long serialVersionUID = 1L;
    
        private static final ObjectMapper objectMapper = new ObjectMapper();
    
        @Override
        public ClickEvent deserialize(byte[] message) throws IOException {
            return objectMapper.readValue(message, ClickEvent.class);
        }
    
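        // Returning false means the stream is unbounded; the consumer never treats an element as end-of-stream.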
        @Override
        public boolean isEndOfStream(ClickEvent nextElement) {
            return false;
        }
    
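        // Tells Flink which type this schema produces, so it can set up downstream serialization.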
        @Override
        public TypeInformation<ClickEvent> getProducedType() {
            return TypeInformation.of(ClickEvent.class);
        }
    }
    

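    To use it, pass the schema to the Kafka consumer. A minimal sketch, assuming a topic named "input" and placeholder connection properties:

    Properties properties = new Properties();
    properties.setProperty("bootstrap.servers", "localhost:9092");
    properties.setProperty("group.id", "click-event-group");
    
    // The schema above turns each Kafka record's bytes into a ClickEvent.
    FlinkKafkaConsumer<ClickEvent> consumer =
            new FlinkKafkaConsumer<>("input", new ClickEventDeserializationSchema(), properties);
    
    DataStream<ClickEvent> clicks = env.addSource(consumer);
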
    For a Kafka producer you'll want to implement KafkaSerializationSchema<T>; you'll find examples of that in the same project.
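
    For illustration, here is a minimal sketch of what such a serialization schema might look like (the class name and JSON mapping are my assumptions, not the playground's exact code):

    import org.apache.flink.shaded.jackson2.com.fasterxml.jackson.databind.ObjectMapper;
    import org.apache.flink.streaming.connectors.kafka.KafkaSerializationSchema;
    
    import org.apache.kafka.clients.producer.ProducerRecord;
    
    public class ClickEventSerializationSchema implements KafkaSerializationSchema<ClickEvent> {
    
        private static final ObjectMapper objectMapper = new ObjectMapper();
    
        private final String topic;
    
        public ClickEventSerializationSchema(String topic) {
            this.topic = topic;
        }
    
        @Override
        public ProducerRecord<byte[], byte[]> serialize(ClickEvent element, Long timestamp) {
            try {
                // Write the event as JSON bytes; the record is produced without a key.
                return new ProducerRecord<>(topic, objectMapper.writeValueAsBytes(element));
            } catch (Exception e) {
                throw new IllegalArgumentException("Could not serialize record: " + element, e);
            }
        }
    }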

  • 2021-01-16 03:21

    To solve the problem of reading non-keyed JSON messages from Kafka, I used a case class and a JSON parser.

    The following code defines a case class and parses the JSON fields using the Play JSON API.

    import java.util.Properties
    
    import scala.util.Try
    
    import org.apache.flink.api.common.serialization.SimpleStringSchema
    import org.apache.flink.streaming.api.scala._
    import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer
    
    import play.api.libs.json.{JsValue, Json}
    
    object CustomerModel {
    
      case class Customer(id: Int, name: String)
    
      // Extracts the "id" and "name" fields from a parsed JSON element.
      // Note: .toString() on a JsString keeps the surrounding quotes, as seen in the sample output below.
      def readElement(jsonElement: JsValue): Customer = {
        val id = (jsonElement \ "id").get.toString().toInt
        val name = (jsonElement \ "name").get.toString()
        Customer(id, name)
      }
    }
    
    // Wrapper object added so the job compiles as a standalone program.
    object KafkaFlinkJob {
    
      import CustomerModel.Customer
    
      def main(args: Array[String]): Unit = {
        val env = StreamExecutionEnvironment.getExecutionEnvironment
    
        val properties = new Properties()
        properties.setProperty("bootstrap.servers", "xxx.xxx.0.114:9092")
        properties.setProperty("group.id", "test-grp")
    
        val consumer = new FlinkKafkaConsumer[String]("customer", new SimpleStringSchema(), properties)
        val stream1 = env.addSource(consumer).rebalance
    
        // Parse each message once; on failure, carry the exception text in the name field.
        val stream2: DataStream[Customer] = stream1.map { str =>
          val parsed = Try(CustomerModel.readElement(Json.parse(str)))
          parsed.getOrElse(Customer(0, parsed.toString))
        }
    
        stream2.print("stream2")
        env.execute("This is Kafka+Flink")
      }
    }
    

    Wrapping the parsing in Try lets you recover from exceptions thrown while parsing bad data: on failure it can return the exception in one of the fields (as above), or simply return a case class object with given or default field values.

    Sample output of the code:

    stream2:1> Customer(1,"Thanh")
    stream2:1> Customer(5,"Huy")
    stream2:3> Customer(0,Failure(com.fasterxml.jackson.databind.JsonMappingException: No content to map due to end-of-input
     at [Source: ; line: 1, column: 0]))
    

    I am not sure whether this is the best approach, but it is working for me for now.
