Filtering between topics

Posted by 非 Y 不嫁゛ on 2019-12-23 04:39:08

Question


I have 1,000 records in a topic. I am trying to filter the records from the input topic into the output topic based on salary.

For example, I want only the records of people whose salary is higher than 30,000.
I am trying to do this with Kafka Streams (KStream) in Java.

The records are in text format (comma-separated), for example:

first_name, last_name, email, gender, ip_address, country, salary
Redacted,Tranfield,user@example.com,Female,45.25.XXX.XXX,Russia,$12345.01
Redacted,Merck,user@example.com,Male,236.224.XXX.XXX,Belarus,$54321.96
Redacted,Kopisch,user@example.com,Male,61.36.XXX.XXX,Morocco,$12345.05
Redacted,Edds,user@example.com,Male,6.87.XXX.XXX,Poland,$54321.72
Redacted,Alston,user@example.com,Female,56.146.XXX.XXX,Indonesia,$12345.16
...

This is my code:

public class StreamsStartApp {
    public static void main(String[] args) {
        System.out.println();
        Properties config = new Properties();
        config.put(StreamsConfig.APPLICATION_ID_CONFIG, "streams-starter-app");
        config.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        config.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
        config.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG,   Serdes.String().getClass());
        config.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();

        // Stream from Kafka topic
        KStream<Long, Long> newInput = builder.stream("word-count-input");
        Stream<Long, Long> usersAndColours = newInput
            // step 1 - we ensure that a comma is here as we will split on it
            .filter(value -> value.contains(",")
            // step 2 - we select a key that will be the user id
            .selectKey((key, value) -> value.split(",")[6])

            // step 3 - got stuck here.
            // .filter(key -> key.value[6] > 30000
            // .selectKey((new1, value1) -> value1.split)(",")[3])
            //  .filter((key, value) -> key.greater(10));
            //    .filter((key, value) -> key > 10);
            // .filter(key -> key.getkey().intValue() > 10);
        usersAndColours.to("new-output");
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close))

In step 1 of the code above, I check that each record contains a ',' so that it can be split into fields.
In step 2, I select one field, the salary, as the key.
In step 3, I am trying to filter the data by the salary field.
I tried several approaches (the commented-out lines), but none of them worked.
Any ideas would help.


Answer 1:


First, both your key and value use String serdes, not Long, so KStream<Long, Long> is not correct; it should be KStream<String, String>.

Also, value.split(",")[6] is just a String, not a Double (and a Long would not fit anyway, since the values have decimals).

You need to remove the $ from the column and parse the string to a Double; then you can filter on it. Also, key.value[6] does not work because your key is not an object with a value field.

And you should probably make the email the key, not the salary, if you even need a key at all.

Realistically, you can do this in one line (split into two here for readability):

newInput.filter((key, value) -> value.contains(",") &&
    Double.parseDouble(value.split(",")[6].replace("$", "")) > 30000);
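
One caveat with parsing inline like this: Double.parseDouble throws a NumberFormatException on any record whose seventh field is not a number, which would include the header row from the sample if it was also written to the topic. Below is a minimal sketch of a more defensive check, using only the Java standard library. The class name SalaryFilter and its methods are hypothetical helpers made up for illustration; they are not part of Kafka Streams.

```java
import java.util.OptionalDouble;

// Hypothetical helper (not a Kafka Streams class): pulls the salary column
// out of one comma-separated record and compares it to a threshold.
final class SalaryFilter {
    // Salary is the 7th field in the sample data, so index 6.
    private static final int SALARY_COLUMN = 6;

    private SalaryFilter() {}

    // Returns the salary, or empty when the record has too few fields or
    // the salary field is not numeric (for example, a header row).
    static OptionalDouble parseSalary(String record) {
        if (record == null) {
            return OptionalDouble.empty();
        }
        String[] fields = record.split(",");
        if (fields.length <= SALARY_COLUMN) {
            return OptionalDouble.empty();
        }
        // Strip surrounding whitespace and the leading currency sign.
        String raw = fields[SALARY_COLUMN].trim().replace("$", "");
        try {
            return OptionalDouble.of(Double.parseDouble(raw));
        } catch (NumberFormatException e) {
            return OptionalDouble.empty();
        }
    }

    // True only when the record has a parseable salary strictly above the threshold.
    static boolean earnsMoreThan(String record, double threshold) {
        OptionalDouble salary = parseSalary(record);
        return salary.isPresent() && salary.getAsDouble() > threshold;
    }
}
```

Assuming newInput is declared as KStream<String, String>, the predicate could then be written as newInput.filter((key, value) -> SalaryFilter.earnsMoreThan(value, 30000)).to("new-output"); so that unparseable records are dropped instead of crashing the stream thread.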


Source: https://stackoverflow.com/questions/49160458/filtering-between-topics
