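An example of using Kafka as the data source that feeds the downstream bolts of a Storm topology, plus an explanation of why the bolt calls input.getString(4) and where that 4 comes from.
Start with the driver code, written here as a JUnit test method; the steps are straightforward.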
package com.lxk.storm;

import com.lxk.storm.bolt.OutInfoBolt;
import org.apache.kafka.common.utils.Utils;
import org.apache.storm.Config;
import org.apache.storm.LocalCluster;
import org.apache.storm.kafka.spout.KafkaSpout;
import org.apache.storm.kafka.spout.KafkaSpoutConfig;
import org.apache.storm.topology.TopologyBuilder;
import org.junit.Test;

/**
 * Simple test: consume data from Kafka and check that it shows up in the bolt.
 *
 * @author LiXuekai on 2020/10/13
 */
public class TestKafkaSpout {
    private static final String TOPOLOGY_NAME = "kafka-spout-topology";

    @Test
    public void test() {
        // 1. Create a TopologyBuilder instance
        TopologyBuilder topologyBuilder = new TopologyBuilder();
        // 2. Instantiate the Kafka spout and register it
        kafkaSpout(topologyBuilder);
        // 3. Register the bolt that consumes the data
        outInfoBolt(topologyBuilder);
        // 4. Submit the topology
        submitTopology(topologyBuilder);
    }

    private void submitTopology(TopologyBuilder builder) {
        // Config is a subclass of HashMap<String, Object> that configures the topology's runtime behavior
        Config config = new Config();
        // Set the number of workers
        //config.setNumWorkers(2);
        LocalCluster cluster = new LocalCluster();
        // Submit to the in-process local cluster
        cluster.submitTopology(TOPOLOGY_NAME, config, builder.createTopology());
        // Let the topology run for 10 seconds, then tear it down
        Utils.sleep(10000);
        cluster.killTopology(TOPOLOGY_NAME);
        cluster.shutdown();
    }

    private void outInfoBolt(TopologyBuilder topologyBuilder) {
        OutInfoBolt outInfoBolt = new OutInfoBolt();
        // The ID passed to shuffleGrouping must be the ID the spout was registered under
        topologyBuilder.setBolt("out-info-bolt-id", outInfoBolt, 1).setNumTasks(1).shuffleGrouping("kafka-info-spout-id");
    }

    private void kafkaSpout(TopologyBuilder topologyBuilder) {
        KafkaSpout<String, String> kafkaSpout = new KafkaSpout<>(initKafkaSpoutConfig());
        topologyBuilder.setSpout("kafka-info-spout-id", kafkaSpout, 1);
    }

    private KafkaSpoutConfig<String, String> initKafkaSpoutConfig() {
        // Arguments: Kafka bootstrap servers, then the topic to subscribe to
        KafkaSpoutConfig.Builder<String, String> builder = KafkaSpoutConfig.builder("192.168.1.191:9092", "a_citic_test_lxk");
        builder.setProp("group.id", "KAFKA_STORM");
        return builder.build();
    }
}
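Note that the ID used when registering the spout with the topology must be the same ID the bolt later subscribes to. Next comes the bolt code, which does just one thing: print the message.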
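package com.lxk.storm.bolt;

import org.apache.storm.task.OutputCollector;
import org.apache.storm.task.TopologyContext;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.base.BaseRichBolt;
import org.apache.storm.tuple.Tuple;

import java.util.Map;

/**
 * Receives the incoming message and prints it.
 *
 * @author LiXuekai on 2020/10/13
 */
public class OutInfoBolt extends BaseRichBolt {
    private OutputCollector collector;

    @Override
    public void prepare(Map stormConf, TopologyContext context, OutputCollector collector) {
        // Keep the collector so execute() can ack tuples
        this.collector = collector;
    }

    @Override
    public void execute(Tuple input) {
        try {
            // Tuple fields emitted by the Kafka spout: [topic, partition, offset, key, value]
            // Index 4 is therefore the message value
            String string = input.getString(4);
            System.out.println(string);
            // BaseRichBolt does not ack automatically; without this the tuple times out and the spout replays it
            collector.ack(input);
        } catch (Exception e) {
            System.out.println(e.getMessage());
        }
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        // Terminal bolt: it emits nothing, so there are no output fields to declare
    }
}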
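The point about matching component IDs can be illustrated without a running cluster. The sketch below is a plain-Java model written for this note, not Storm's TopologyBuilder: the class WiringSketch and its methods are hypothetical, and only the two ID strings come from the code above. It shows how sharing one constant between the spout registration and the bolt subscription avoids the typo case, where the bolt would subscribe to a component that was never registered.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

/** Plain-Java model of topology wiring (hypothetical; NOT Storm's TopologyBuilder). */
final class WiringSketch {
    // One constant per component, used both when registering and when subscribing
    static final String SPOUT_ID = "kafka-info-spout-id";
    static final String BOLT_ID = "out-info-bolt-id";

    private final Set<String> components = new LinkedHashSet<>();
    // bolt ID -> ID of the component it subscribes to
    private final Map<String, String> subscriptions = new LinkedHashMap<>();

    void addSpout(String id) {
        components.add(id);
    }

    void addBolt(String id, String sourceId) {
        components.add(id);
        subscriptions.put(id, sourceId);
    }

    /** Returns the first bolt whose source ID was never registered, or null if every source exists. */
    String firstDanglingBolt() {
        for (Map.Entry<String, String> e : subscriptions.entrySet()) {
            if (!components.contains(e.getValue())) {
                return e.getKey();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        WiringSketch ok = new WiringSketch();
        ok.addSpout(SPOUT_ID);
        ok.addBolt(BOLT_ID, SPOUT_ID);
        System.out.println(ok.firstDanglingBolt());   // prints "null": the IDs match

        WiringSketch typo = new WiringSketch();
        typo.addSpout(SPOUT_ID);
        typo.addBolt(BOLT_ID, "kafka-info-spout");    // misspelled source ID
        System.out.println(typo.firstDanglingBolt()); // prints "out-info-bolt-id"
    }
}
```

In the real topology the same idea would amount to declaring the two IDs once as constants in TestKafkaSpout and passing them to setSpout, setBolt and shuffleGrouping instead of repeating string literals.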
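Finally, a debug screenshot makes it clearer why the bolt calls input.getString(4) and where the 4 comes from: the tuple emitted by the Kafka spout carries five values in the order [topic, partition, offset, key, value], so index 4 is the message value.
getString returns the value at that position as a String; there is also getValue, which returns an Object.
When I was first learning Storm, had I known about the method that returns an Object, which you then cast to whatever type you need, passing data between bolts would have been much easier. The names of the other methods in the screenshot make it fairly easy to guess what each one returns.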
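To make the index arithmetic concrete, here is a minimal plain-Java model of positional access, assuming the field order given in the bolt's comment. TupleIndexSketch and the sample record are hypothetical and stand in for Storm's Tuple, which is not used here; the helper names merely mirror the Tuple accessors getString(int), getValue(int) and fieldIndex(String).

```java
import java.util.Arrays;
import java.util.List;

/** Plain-Java model of positional tuple access (hypothetical; NOT Storm's Tuple class). */
final class TupleIndexSketch {
    // Field order taken from the comment in OutInfoBolt
    static final List<String> FIELDS = Arrays.asList("topic", "partition", "offset", "key", "value");

    /** Position of a named field, which is where a number like 4 comes from. */
    static int fieldIndex(String name) {
        int i = FIELDS.indexOf(name);
        if (i < 0) {
            throw new IllegalArgumentException("unknown field: " + name);
        }
        return i;
    }

    /** Untyped access: the caller casts to the type it expects. */
    static Object getValue(List<Object> values, int index) {
        return values.get(index);
    }

    /** Typed convenience access: a cast to String, not a conversion. */
    static String getString(List<Object> values, int index) {
        return (String) values.get(index);
    }

    public static void main(String[] args) {
        // Hypothetical record: topic, partition, offset, key (absent), value
        List<Object> values = Arrays.asList("a_citic_test_lxk", 0, 42L, null, "hello storm");

        System.out.println(fieldIndex("value"));     // prints 4
        System.out.println(getString(values, 4));    // prints hello storm

        // Object-returning access followed by a cast works for non-String fields too
        long offset = (Long) getValue(values, fieldIndex("offset"));
        System.out.println(offset + 1);              // prints 43
    }
}
```

In a real bolt, Storm's Tuple also offers name-based accessors such as getStringByField("value"), which avoid hard-coding the position at all.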
The code is on GitHub (https://github.com/cmshome/JavaNote/tree/master/storm/src/main/java/com/lxk/storm); take whatever you need and test it locally.
Source: oschina
Link: https://my.oschina.net/u/4385242/blog/4672629