Preface
This article takes a look at Storm Trident's coordinators.
Example
Code sample
@Test
public void testDebugTopologyBuild(){
    FixedBatchSpout spout = new FixedBatchSpout(new Fields("user", "score"), 3,
            new Values("nickt1", 4),
            new Values("nickt2", 7),
            new Values("nickt3", 8),
            new Values("nickt4", 9),
            new Values("nickt5", 7),
            new Values("nickt6", 11),
            new Values("nickt7", 5)
    );
    spout.setCycle(false);
    TridentTopology topology = new TridentTopology();
    Stream stream1 = topology.newStream("spout1", spout)
            .each(new Fields("user", "score"), new BaseFunction() {
                @Override
                public void execute(TridentTuple tuple, TridentCollector collector) {
                    System.out.println("tuple:" + tuple);
                }
            }, new Fields());
    topology.build();
}
- The spout used here is FixedBatchSpout, which is of type IBatchSpout
Topology diagram
MasterBatchCoordinator
storm-1.2.2/storm-core/src/jvm/org/apache/storm/trident/topology/MasterBatchCoordinator.java
public class MasterBatchCoordinator extends BaseRichSpout {
    public static final Logger LOG = LoggerFactory.getLogger(MasterBatchCoordinator.class);

    public static final long INIT_TXID = 1L;

    public static final String BATCH_STREAM_ID = "$batch";
    public static final String COMMIT_STREAM_ID = "$commit";
    public static final String SUCCESS_STREAM_ID = "$success";

    private static final String CURRENT_TX = "currtx";
    private static final String CURRENT_ATTEMPTS = "currattempts";

    private List<TransactionalState> _states = new ArrayList();

    TreeMap<Long, TransactionStatus> _activeTx = new TreeMap<Long, TransactionStatus>();
    TreeMap<Long, Integer> _attemptIds;

    private SpoutOutputCollector _collector;
    Long _currTransaction;
    int _maxTransactionActive;

    List<ITridentSpout.BatchCoordinator> _coordinators = new ArrayList();

    List<String> _managedSpoutIds;
    List<ITridentSpout> _spouts;
    WindowedTimeThrottler _throttler;

    boolean _active = true;

    public MasterBatchCoordinator(List<String> spoutIds, List<ITridentSpout> spouts) {
        if(spoutIds.isEmpty()) {
            throw new IllegalArgumentException("Must manage at least one spout");
        }
        _managedSpoutIds = spoutIds;
        _spouts = spouts;
        LOG.debug("Created {}", this);
    }

    public List<String> getManagedSpoutIds(){
        return _managedSpoutIds;
    }

    @Override
    public void activate() {
        _active = true;
    }

    @Override
    public void deactivate() {
        _active = false;
    }

    @Override
    public void open(Map conf, TopologyContext context, SpoutOutputCollector collector) {
        _throttler = new WindowedTimeThrottler((Number)conf.get(Config.TOPOLOGY_TRIDENT_BATCH_EMIT_INTERVAL_MILLIS), 1);
        for(String spoutId: _managedSpoutIds) {
            _states.add(TransactionalState.newCoordinatorState(conf, spoutId));
        }
        _currTransaction = getStoredCurrTransaction();

        _collector = collector;
        Number active = (Number) conf.get(Config.TOPOLOGY_MAX_SPOUT_PENDING);
        if(active==null) {
            _maxTransactionActive = 1;
        } else {
            _maxTransactionActive = active.intValue();
        }
        _attemptIds = getStoredCurrAttempts(_currTransaction, _maxTransactionActive);

        for(int i=0; i<_spouts.size(); i++) {
            String txId = _managedSpoutIds.get(i);
            _coordinators.add(_spouts.get(i).getCoordinator(txId, conf, context));
        }
        LOG.debug("Opened {}", this);
    }

    @Override
    public void close() {
        for(TransactionalState state: _states) {
            state.close();
        }
        LOG.debug("Closed {}", this);
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        // in partitioned example, in case an emitter task receives a later transaction than it's emitted so far,
        // when it sees the earlier txid it should know to emit nothing
        declarer.declareStream(BATCH_STREAM_ID, new Fields("tx"));
        declarer.declareStream(COMMIT_STREAM_ID, new Fields("tx"));
        declarer.declareStream(SUCCESS_STREAM_ID, new Fields("tx"));
    }

    @Override
    public Map<String, Object> getComponentConfiguration() {
        Config ret = new Config();
        ret.setMaxTaskParallelism(1);
        ret.registerSerialization(TransactionAttempt.class);
        return ret;
    }

    //......
}
- The open method first reads Config.TOPOLOGY_TRIDENT_BATCH_EMIT_INTERVAL_MILLIS (topology.trident.batch.emit.interval.millis, 500 by default in defaults.yaml), the interval that controls how often batches are triggered, and uses it to create a WindowedTimeThrottler whose maxAmt is 1
- TransactionalState is used here to maintain the transactional state in zookeeper, one per managed spout id
- It then reads Config.TOPOLOGY_MAX_SPOUT_PENDING (topology.max.spout.pending, null by default in defaults.yaml) to set _maxTransactionActive, falling back to 1 when the value is null
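For reference, both knobs can be tuned per topology via the conf used at submission time; a minimal sketch (the class name is illustrative, and the chosen values are examples only):

import org.apache.storm.Config;

public class TridentBatchTuning {
    // Sketch: the two settings MasterBatchCoordinator.open reads.
    public static Config conf() {
        Config conf = new Config();
        // trigger a new batch at most once every 200 ms (defaults.yaml: 500)
        conf.put(Config.TOPOLOGY_TRIDENT_BATCH_EMIT_INTERVAL_MILLIS, 200);
        // allow 3 pipelined TransactionAttempts at once (_maxTransactionActive; defaults to 1 when unset)
        conf.setMaxSpoutPending(3);
        return conf;
    }
}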
MasterBatchCoordinator.nextTuple
storm-1.2.2/storm-core/src/jvm/org/apache/storm/trident/topology/MasterBatchCoordinator.java
@Override
public void nextTuple() {
    sync();
}

private void sync() {
    // note that sometimes the tuples active may be less than max_spout_pending, e.g.
    // max_spout_pending = 3
    // tx 1, 2, 3 active, tx 2 is acked. there won't be a commit for tx 2 (because tx 1 isn't committed yet),
    // and there won't be a batch for tx 4 because there's max_spout_pending tx active
    TransactionStatus maybeCommit = _activeTx.get(_currTransaction);
    if(maybeCommit!=null && maybeCommit.status == AttemptStatus.PROCESSED) {
        maybeCommit.status = AttemptStatus.COMMITTING;
        _collector.emit(COMMIT_STREAM_ID, new Values(maybeCommit.attempt), maybeCommit.attempt);
        LOG.debug("Emitted on [stream = {}], [tx_status = {}], [{}]", COMMIT_STREAM_ID, maybeCommit, this);
    }

    if(_active) {
        if(_activeTx.size() < _maxTransactionActive) {
            Long curr = _currTransaction;
            for(int i=0; i<_maxTransactionActive; i++) {
                if(!_activeTx.containsKey(curr) && isReady(curr)) {
                    // by using a monotonically increasing attempt id, downstream tasks
                    // can be memory efficient by clearing out state for old attempts
                    // as soon as they see a higher attempt id for a transaction
                    Integer attemptId = _attemptIds.get(curr);
                    if(attemptId==null) {
                        attemptId = 0;
                    } else {
                        attemptId++;
                    }
                    _attemptIds.put(curr, attemptId);
                    for(TransactionalState state: _states) {
                        state.setData(CURRENT_ATTEMPTS, _attemptIds);
                    }

                    TransactionAttempt attempt = new TransactionAttempt(curr, attemptId);
                    final TransactionStatus newTransactionStatus = new TransactionStatus(attempt);
                    _activeTx.put(curr, newTransactionStatus);
                    _collector.emit(BATCH_STREAM_ID, new Values(attempt), attempt);
                    LOG.debug("Emitted on [stream = {}], [tx_attempt = {}], [tx_status = {}], [{}]", BATCH_STREAM_ID, attempt, newTransactionStatus, this);
                    _throttler.markEvent();
                }
                curr = nextTransactionId(curr);
            }
        }
    }
}
- nextTuple simply calls sync, a method that is also invoked from both ack and fail. sync first checks the status of the current transaction: if it needs to be committed, a tuple is emitted on MasterBatchCoordinator.COMMIT_STREAM_ID ($commit). Then, subject to the _maxTransactionActive limit and the WindowedTimeThrottler, it starts new TransactionAttempts where allowed, emitting a tuple on MasterBatchCoordinator.BATCH_STREAM_ID ($batch) for each and marking a window event on the WindowedTimeThrottler
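To see how the throttler gates batch creation: isReady (not quoted above) consults _throttler.isThrottled() before a new attempt is started. A small standalone sketch of that usage pattern, assuming WindowedTimeThrottler's (Number, Number) constructor and its isThrottled()/markEvent() methods; the loop and timings are illustrative only:

import org.apache.storm.utils.WindowedTimeThrottler;

public class ThrottlerDemo {
    public static void main(String[] args) throws InterruptedException {
        // same shape as MasterBatchCoordinator.open: at most 1 event per 500 ms window
        WindowedTimeThrottler throttler = new WindowedTimeThrottler(500, 1);
        for (int i = 0; i < 5; i++) {
            if (!throttler.isThrottled()) {
                // in sync() this is where a new TransactionAttempt is emitted on $batch
                System.out.println("emit batch " + i);
                throttler.markEvent();
            }
            Thread.sleep(200); // only roughly every third iteration passes the window check
        }
    }
}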
MasterBatchCoordinator.ack
storm-1.2.2/storm-core/src/jvm/org/apache/storm/trident/topology/MasterBatchCoordinator.java
@Override
public void ack(Object msgId) {
    TransactionAttempt tx = (TransactionAttempt) msgId;
    TransactionStatus status = _activeTx.get(tx.getTransactionId());
    LOG.debug("Ack. [tx_attempt = {}], [tx_status = {}], [{}]", tx, status, this);
    if(status!=null && tx.equals(status.attempt)) {
        if(status.status==AttemptStatus.PROCESSING) {
            status.status = AttemptStatus.PROCESSED;
            LOG.debug("Changed status. [tx_attempt = {}] [tx_status = {}]", tx, status);
        } else if(status.status==AttemptStatus.COMMITTING) {
            _activeTx.remove(tx.getTransactionId());
            _attemptIds.remove(tx.getTransactionId());
            _collector.emit(SUCCESS_STREAM_ID, new Values(tx));
            _currTransaction = nextTransactionId(tx.getTransactionId());
            for(TransactionalState state: _states) {
                state.setData(CURRENT_TX, _currTransaction);
            }
            LOG.debug("Emitted on [stream = {}], [tx_attempt = {}], [tx_status = {}], [{}]", SUCCESS_STREAM_ID, tx, status, this);
        }
        sync();
    }
}
- ack acts according to the current transaction status: if the status was AttemptStatus.PROCESSING, it is updated to AttemptStatus.PROCESSED; if it was AttemptStatus.COMMITTING, the current transaction is removed, a tuple is emitted on MasterBatchCoordinator.SUCCESS_STREAM_ID ($success), and _currTransaction is advanced to nextTransactionId. Finally, sync is called again to trigger a new TransactionAttempt
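Putting sync and ack together, each TransactionAttempt moves through a small state machine. The enum below is only a sketch of that lifecycle for readability (AttemptStatus itself is a private enum inside MasterBatchCoordinator):

// Sketch of the per-transaction lifecycle driven by sync() and ack().
public enum AttemptLifecycle {
    PROCESSING,  // $batch emitted; the batch's tuple tree is in flight
    PROCESSED,   // ack received while PROCESSING; batch done, not yet committed
    COMMITTING   // sync() saw PROCESSED on _currTransaction and emitted $commit
    // an ack while COMMITTING removes the tx from _activeTx, emits $success,
    // and advances _currTransaction to nextTransactionId
}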
MasterBatchCoordinator.fail
storm-1.2.2/storm-core/src/jvm/org/apache/storm/trident/topology/MasterBatchCoordinator.java
@Override
public void fail(Object msgId) {
    TransactionAttempt tx = (TransactionAttempt) msgId;
    TransactionStatus stored = _activeTx.remove(tx.getTransactionId());
    LOG.debug("Fail. [tx_attempt = {}], [tx_status = {}], [{}]", tx, stored, this);
    if(stored!=null && tx.equals(stored.attempt)) {
        _activeTx.tailMap(tx.getTransactionId()).clear();
        sync();
    }
}
- fail removes the current transaction from _activeTx, then clears every entry in _activeTx whose txId is greater than the failed txId, and finally calls sync to decide whether a new TransactionAttempt should be triggered (note that _currTransaction is not changed here, so the _txid of the new TransactionAttempt that sync triggers is still this failed _currTransaction). The tailMap-based cleanup is illustrated below
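The cleanup relies on TreeMap.tailMap returning a live view of the map, so clearing the view mutates _activeTx itself; a self-contained illustration of those semantics:

import java.util.TreeMap;

public class TailMapDemo {
    public static void main(String[] args) {
        TreeMap<Long, String> activeTx = new TreeMap<>();
        activeTx.put(1L, "tx1");
        activeTx.put(2L, "tx2");
        activeTx.put(3L, "tx3");

        long failedTxId = 2L;
        activeTx.remove(failedTxId);          // drop the failed attempt itself
        activeTx.tailMap(failedTxId).clear(); // live view: also drops every txId > 2
        System.out.println(activeTx);         // {1=tx1} -- tx3 must be replayed after tx2
    }
}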
TridentSpoutCoordinator
storm-1.2.2/storm-core/src/jvm/org/apache/storm/trident/spout/TridentSpoutCoordinator.java
public class TridentSpoutCoordinator implements IBasicBolt {
    public static final Logger LOG = LoggerFactory.getLogger(TridentSpoutCoordinator.class);
    private static final String META_DIR = "meta";

    ITridentSpout<Object> _spout;
    ITridentSpout.BatchCoordinator<Object> _coord;
    RotatingTransactionalState _state;
    TransactionalState _underlyingState;
    String _id;

    public TridentSpoutCoordinator(String id, ITridentSpout<Object> spout) {
        _spout = spout;
        _id = id;
    }

    @Override
    public void prepare(Map conf, TopologyContext context) {
        _coord = _spout.getCoordinator(_id, conf, context);
        _underlyingState = TransactionalState.newCoordinatorState(conf, _id);
        _state = new RotatingTransactionalState(_underlyingState, META_DIR);
    }

    @Override
    public void execute(Tuple tuple, BasicOutputCollector collector) {
        TransactionAttempt attempt = (TransactionAttempt) tuple.getValue(0);

        if(tuple.getSourceStreamId().equals(MasterBatchCoordinator.SUCCESS_STREAM_ID)) {
            _state.cleanupBefore(attempt.getTransactionId());
            _coord.success(attempt.getTransactionId());
        } else {
            long txid = attempt.getTransactionId();
            Object prevMeta = _state.getPreviousState(txid);
            Object meta = _coord.initializeTransaction(txid, prevMeta, _state.getState(txid));
            _state.overrideState(txid, meta);
            collector.emit(MasterBatchCoordinator.BATCH_STREAM_ID, new Values(attempt, meta));
        }
    }

    @Override
    public void cleanup() {
        _coord.close();
        _underlyingState.close();
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declareStream(MasterBatchCoordinator.BATCH_STREAM_ID, new Fields("tx", "metadata"));
    }

    @Override
    public Map<String, Object> getComponentConfiguration() {
        Config ret = new Config();
        ret.setMaxTaskParallelism(1);
        return ret;
    }
}
- TridentSpoutCoordinator's execute method handles tuples differently depending on their source streamId
- If the stream is MasterBatchCoordinator.SUCCESS_STREAM_ID ($success), it means the master has received the ack and the transaction succeeded; the coordinator then cleans up the state stored before that txId and calls back the success method of ITridentSpout.BatchCoordinator (a minimal sketch of that interface follows this list)
- If the stream is MasterBatchCoordinator.BATCH_STREAM_ID ($batch), a new TransactionAttempt is to be started; the coordinator initializes the batch metadata and emits a tuple on MasterBatchCoordinator.BATCH_STREAM_ID ($batch), which is received by the downstream bolt (in this example, the TridentBoltExecutor that wraps the user's spout via TridentSpoutExecutor)
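For context, ITridentSpout.BatchCoordinator is the interface behind _coord. A minimal do-nothing sketch of its shape (the metadata type and the null return are illustrative; for an IBatchSpout source, BatchSpoutExecutor's EmptyCoordinator behaves essentially like this):

import java.util.Map;
import org.apache.storm.trident.spout.ITridentSpout;

// Sketch: the coordinator contract that TridentSpoutCoordinator drives.
public class NoopBatchCoordinator implements ITridentSpout.BatchCoordinator<Map> {
    @Override
    public Map initializeTransaction(long txid, Map prevMetadata, Map currMetadata) {
        // compute metadata describing what this batch should contain; it is
        // persisted via RotatingTransactionalState and handed to the emitter
        return null;
    }

    @Override
    public void success(long txid) {
        // called when the $success tuple for txid arrives
    }

    @Override
    public boolean isReady(long txid) {
        return true; // MasterBatchCoordinator polls this before starting a batch
    }

    @Override
    public void close() {
    }
}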
TridentBoltExecutor
storm-1.2.2/storm-core/src/jvm/org/apache/storm/trident/topology/TridentBoltExecutor.java
public class TridentBoltExecutor implements IRichBolt {
    public static final String COORD_STREAM_PREFIX = "$coord-";

    public static String COORD_STREAM(String batch) {
        return COORD_STREAM_PREFIX + batch;
    }

    RotatingMap<Object, TrackedBatch> _batches;

    @Override
    public void prepare(Map conf, TopologyContext context, OutputCollector collector) {
        _messageTimeoutMs = context.maxTopologyMessageTimeout() * 1000L;
        _lastRotate = System.currentTimeMillis();
        _batches = new RotatingMap<>(2);
        _context = context;
        _collector = collector;
        _coordCollector = new CoordinatedOutputCollector(collector);
        _coordOutputCollector = new BatchOutputCollectorImpl(new OutputCollector(_coordCollector));

        _coordConditions = (Map) context.getExecutorData("__coordConditions");
        if(_coordConditions==null) {
            _coordConditions = new HashMap<>();
            for(String batchGroup: _coordSpecs.keySet()) {
                CoordSpec spec = _coordSpecs.get(batchGroup);
                CoordCondition cond = new CoordCondition();
                cond.commitStream = spec.commitStream;
                cond.expectedTaskReports = 0;
                for(String comp: spec.coords.keySet()) {
                    CoordType ct = spec.coords.get(comp);
                    if(ct.equals(CoordType.single())) {
                        cond.expectedTaskReports+=1;
                    } else {
                        cond.expectedTaskReports+=context.getComponentTasks(comp).size();
                    }
                }
                cond.targetTasks = new HashSet<>();
                for(String component: Utils.get(context.getThisTargets(),
                        COORD_STREAM(batchGroup),
                        new HashMap<String, Grouping>()).keySet()) {
                    cond.targetTasks.addAll(context.getComponentTasks(component));
                }
                _coordConditions.put(batchGroup, cond);
            }
            context.setExecutorData("_coordConditions", _coordConditions);
        }
        _bolt.prepare(conf, context, _coordOutputCollector);
    }

    //......

    @Override
    public void cleanup() {
        _bolt.cleanup();
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        _bolt.declareOutputFields(declarer);
        for(String batchGroup: _coordSpecs.keySet()) {
            declarer.declareStream(COORD_STREAM(batchGroup), true, new Fields("id", "count"));
        }
    }

    @Override
    public Map<String, Object> getComponentConfiguration() {
        Map<String, Object> ret = _bolt.getComponentConfiguration();
        if(ret==null) ret = new HashMap<>();
        ret.put(Config.TOPOLOGY_TICK_TUPLE_FREQ_SECS, 5);
        // TODO: Need to be able to set the tick tuple time to the message timeout, ideally without parameterization
        return ret;
    }
}
- In prepare, a CoordinatedOutputCollector is created first, then wrapped in an OutputCollector, and finally in a BatchOutputCollectorImpl, which is passed to ITridentBatchBolt.prepare; the ITridentBatchBolt implementation used here is TridentSpoutExecutor
- prepare also initializes RotatingMap<Object, TrackedBatch> _batches = new RotatingMap<>(2);
- The main remaining work in prepare is building a CoordCondition per batch group, i.e. computing expectedTaskReports and targetTasks; a sketch of the per-task tuple counting that feeds these reports follows this list
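CoordinatedOutputCollector's job is bookkeeping: it counts, per target task, how many tuples the current batch has emitted, and those counts are later compared against expectedTupleCount. A simplified sketch of the counting idea only (this is not the actual class, whose real version also handles anchoring and batch switching):

import java.util.HashMap;
import java.util.Map;

// Simplified sketch of the counting inside CoordinatedOutputCollector.
public class EmitCounter {
    private final Map<Integer, Integer> emittedPerTask = new HashMap<>();

    // called for every tuple routed to a downstream task during the batch
    public void recordEmit(int targetTaskId) {
        emittedPerTask.merge(targetTaskId, 1, Integer::sum);
    }

    // flushed on the $coord-<batchGroup> stream as the "count" field, so each
    // downstream task can check receivedTuples against expectedTupleCount
    public int countFor(int targetTaskId) {
        return emittedPerTask.getOrDefault(targetTaskId, 0);
    }
}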
TridentBoltExecutor.execute
storm-1.2.2/storm-core/src/jvm/org/apache/storm/trident/topology/TridentBoltExecutor.java
@Override
public void execute(Tuple tuple) {
    if(TupleUtils.isTick(tuple)) {
        long now = System.currentTimeMillis();
        if(now - _lastRotate > _messageTimeoutMs) {
            _batches.rotate();
            _lastRotate = now;
        }
        return;
    }
    String batchGroup = _batchGroupIds.get(tuple.getSourceGlobalStreamId());
    if(batchGroup==null) {
        // this is so we can do things like have simple DRPC that doesn't need to use batch processing
        _coordCollector.setCurrBatch(null);
        _bolt.execute(null, tuple);
        _collector.ack(tuple);
        return;
    }
    IBatchID id = (IBatchID) tuple.getValue(0);
    //get transaction id
    //if it already exists and attempt id is greater than the attempt there

    TrackedBatch tracked = (TrackedBatch) _batches.get(id.getId());
//        if(_batches.size() > 10 && _context.getThisTaskIndex() == 0) {
//            System.out.println("Received in " + _context.getThisComponentId() + " " + _context.getThisTaskIndex()
//                    + " (" + _batches.size() + ")" +
//                    "\ntuple: " + tuple +
//                    "\nwith tracked " + tracked +
//                    "\nwith id " + id +
//                    "\nwith group " + batchGroup
//                    + "\n");
//
//        }
    //System.out.println("Num tracked: " + _batches.size() + " " + _context.getThisComponentId() + " " + _context.getThisTaskIndex());

    // this code here ensures that only one attempt is ever tracked for a batch, so when
    // failures happen you don't get an explosion in memory usage in the tasks
    if(tracked!=null) {
        if(id.getAttemptId() > tracked.attemptId) {
            _batches.remove(id.getId());
            tracked = null;
        } else if(id.getAttemptId() < tracked.attemptId) {
            // no reason to try to execute a previous attempt than we've already seen
            return;
        }
    }

    if(tracked==null) {
        tracked = new TrackedBatch(new BatchInfo(batchGroup, id, _bolt.initBatchState(batchGroup, id)), _coordConditions.get(batchGroup), id.getAttemptId());
        _batches.put(id.getId(), tracked);
    }
    _coordCollector.setCurrBatch(tracked);

    //System.out.println("TRACKED: " + tracked + " " + tuple);

    TupleType t = getTupleType(tuple, tracked);
    if(t==TupleType.COMMIT) {
        tracked.receivedCommit = true;
        checkFinish(tracked, tuple, t);
    } else if(t==TupleType.COORD) {
        int count = tuple.getInteger(1);
        tracked.reportedTasks++;
        tracked.expectedTupleCount+=count;
        checkFinish(tracked, tuple, t);
    } else {
        tracked.receivedTuples++;
        boolean success = true;
        try {
            _bolt.execute(tracked.info, tuple);
            if(tracked.condition.expectedTaskReports==0) {
                success = finishBatch(tracked, tuple);
            }
        } catch(FailedException e) {
            failBatch(tracked, e);
        }
        if(success) {
            _collector.ack(tuple);
        } else {
            _collector.fail(tuple);
        }
    }
    _coordCollector.setCurrBatch(null);
}

private TupleType getTupleType(Tuple tuple, TrackedBatch batch) {
    CoordCondition cond = batch.condition;
    if(cond.commitStream!=null
            && tuple.getSourceGlobalStreamId().equals(cond.commitStream)) {
        return TupleType.COMMIT;
    } else if(cond.expectedTaskReports > 0
            && tuple.getSourceStreamId().startsWith(COORD_STREAM_PREFIX)) {
        return TupleType.COORD;
    } else {
        return TupleType.REGULAR;
    }
}

private void failBatch(TrackedBatch tracked, FailedException e) {
    if(e!=null && e instanceof ReportedFailedException) {
        _collector.reportError(e);
    }
    tracked.failed = true;
    if(tracked.delayedAck!=null) {
        _collector.fail(tracked.delayedAck);
        tracked.delayedAck = null;
    }
}
- TridentBoltExecutor's execute method first checks whether the tuple is a tick tuple; if so, and the time elapsed since _lastRotate (initialized in prepare to the current time) exceeds _messageTimeoutMs, it performs _batches.rotate(). Tick tuples are emitted at the frequency of Config.TOPOLOGY_TICK_TUPLE_FREQ_SECS (topology.tick.tuple.freq.secs), which TridentBoltExecutor sets to 5 seconds; _messageTimeoutMs is context.maxTopologyMessageTimeout() * 1000L, i.e. the maximum Config.TOPOLOGY_MESSAGE_TIMEOUT_SECS (topology.message.timeout.secs, 30 by default in defaults.yaml) across the topology's components, times 1000. A sketch of RotatingMap's expiry behavior follows this list
- _batches stores TrackedBatch entries keyed by the TransactionAttempt's txId; if no entry exists yet, a new TrackedBatch is created, which calls back _bolt's initBatchState method
- The tuple is then classified as TupleType.COMMIT, TupleType.COORD, or TupleType.REGULAR. For TupleType.COMMIT, tracked.receivedCommit is set to true and checkFinish is called; for TupleType.COORD, the reportedTasks and expectedTupleCount counters are updated and checkFinish is called; for TupleType.REGULAR (the batch tuples sent by the coordinator), the receivedTuples counter is updated and _bolt.execute is invoked (the _bolt here is TridentSpoutExecutor). When tracked.condition.expectedTaskReports==0, finishBatch is called immediately and the batch is removed from _batches; a FailedException leads straight to failBatch, which reports the error; afterwards the tuple is acked or failed accordingly
- Note a subtlety for downstream each operations: if only some tuples of a batch throw FailedException, the failure reaches the master only after all tuples of the batch have been executed and the TupleType.COORD tuple triggers checkFinish, so there is some lag. For example, if a batch has 3 tuples and the second throws FailedException, the third is still executed, and only once the whole batch has been processed does the TupleType.COORD tuple arrive and checkFinish detect the failure
TridentBoltExecutor.checkFinish
storm-1.2.2/storm-core/src/jvm/org/apache/storm/trident/topology/TridentBoltExecutor.java
private void checkFinish(TrackedBatch tracked, Tuple tuple, TupleType type) {
    if(tracked.failed) {
        failBatch(tracked);
        _collector.fail(tuple);
        return;
    }
    CoordCondition cond = tracked.condition;
    boolean delayed = tracked.delayedAck==null &&
            (cond.commitStream!=null && type==TupleType.COMMIT
                    || cond.commitStream==null);
    if(delayed) {
        tracked.delayedAck = tuple;
    }
    boolean failed = false;
    if(tracked.receivedCommit && tracked.reportedTasks == cond.expectedTaskReports) {
        if(tracked.receivedTuples == tracked.expectedTupleCount) {
            finishBatch(tracked, tuple);
        } else {
            //TODO: add logging that not all tuples were received
            failBatch(tracked);
            _collector.fail(tuple);
            failed = true;
        }
    }
    if(!delayed && !failed) {
        _collector.ack(tuple);
    }
}

private void failBatch(TrackedBatch tracked) {
    failBatch(tracked, null);
}

private void failBatch(TrackedBatch tracked, FailedException e) {
    if(e!=null && e instanceof ReportedFailedException) {
        _collector.reportError(e);
    }
    tracked.failed = true;
    if(tracked.delayedAck!=null) {
        _collector.fail(tracked.delayedAck);
        tracked.delayedAck = null;
    }
}
- TridentBoltExecutor's execute calls checkFinish whenever the tuple is of type TupleType.COMMIT or TupleType.COORD
- As soon as _bolt.execute(tracked.info, tuple) throws a FailedException, failBatch is called, which marks tracked.failed as true
- When checkFinish then finds tracked.failed == true, it calls _collector.fail(tuple), which in turn calls back MasterBatchCoordinator's fail method; an example of triggering this path from user code follows
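From user code, the way to route a batch into this failure path is to throw FailedException (or its subclass ReportedFailedException, which additionally triggers reportError) from a Trident operation. A minimal sketch; the class name and validation rule are illustrative, reusing the "score" field from the example topology:

import org.apache.storm.topology.FailedException;
import org.apache.storm.trident.operation.BaseFunction;
import org.apache.storm.trident.operation.TridentCollector;
import org.apache.storm.trident.tuple.TridentTuple;

// Sketch: failing the whole batch from an each() function.
public class ScoreValidator extends BaseFunction {
    @Override
    public void execute(TridentTuple tuple, TridentCollector collector) {
        int score = tuple.getIntegerByField("score");
        if (score < 0) {
            // propagates to TridentBoltExecutor.failBatch -> MasterBatchCoordinator.fail,
            // so the whole TransactionAttempt (not just this tuple) is replayed
            throw new FailedException("negative score: " + score);
        }
        collector.emit(tuple.getValues());
    }
}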
TridentSpoutExecutor
storm-1.2.2/storm-core/src/jvm/org/apache/storm/trident/spout/TridentSpoutExecutor.java
public class TridentSpoutExecutor implements ITridentBatchBolt {
    public static final String ID_FIELD = "$tx";

    public static final Logger LOG = LoggerFactory.getLogger(TridentSpoutExecutor.class);

    AddIdCollector _collector;
    ITridentSpout<Object> _spout;
    ITridentSpout.Emitter<Object> _emitter;
    String _streamName;
    String _txStateId;

    TreeMap<Long, TransactionAttempt> _activeBatches = new TreeMap<>();

    public TridentSpoutExecutor(String txStateId, String streamName, ITridentSpout<Object> spout) {
        _txStateId = txStateId;
        _spout = spout;
        _streamName = streamName;
    }

    @Override
    public void prepare(Map conf, TopologyContext context, BatchOutputCollector collector) {
        _emitter = _spout.getEmitter(_txStateId, conf, context);
        _collector = new AddIdCollector(_streamName, collector);
    }

    @Override
    public void execute(BatchInfo info, Tuple input) {
        // there won't be a BatchInfo for the success stream
        TransactionAttempt attempt = (TransactionAttempt) input.getValue(0);
        if(input.getSourceStreamId().equals(MasterBatchCoordinator.COMMIT_STREAM_ID)) {
            if(attempt.equals(_activeBatches.get(attempt.getTransactionId()))) {
                ((ICommitterTridentSpout.Emitter) _emitter).commit(attempt);
                _activeBatches.remove(attempt.getTransactionId());
            } else {
                throw new FailedException("Received commit for different transaction attempt");
            }
        } else if(input.getSourceStreamId().equals(MasterBatchCoordinator.SUCCESS_STREAM_ID)) {
            // valid to delete before what's been committed since
            // those batches will never be accessed again
            _activeBatches.headMap(attempt.getTransactionId()).clear();
            _emitter.success(attempt);
        } else {
            _collector.setBatch(info.batchId);
            _emitter.emitBatch(attempt, input.getValue(1), _collector);
            _activeBatches.put(attempt.getTransactionId(), attempt);
        }
    }

    @Override
    public void cleanup() {
        _emitter.close();
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        List<String> fields = new ArrayList<>(_spout.getOutputFields().toList());
        fields.add(0, ID_FIELD);
        declarer.declareStream(_streamName, new Fields(fields));
    }

    @Override
    public Map<String, Object> getComponentConfiguration() {
        return _spout.getComponentConfiguration();
    }

    @Override
    public void finishBatch(BatchInfo batchInfo) {
    }

    @Override
    public Object initBatchState(String batchGroup, Object batchId) {
        return null;
    }
}
- The BatchOutputCollector used by TridentSpoutExecutor is the one TridentBoltExecutor constructed in its prepare method, wrapped in several layers: first CoordinatedOutputCollector, then OutputCollector, finally BatchOutputCollectorImpl. The essential layer is the CoordinatedOutputCollector wrapper, which maintains the count of tuples emitted to each taskId; in this executor's own prepare method the collector is wrapped once more as an AddIdCollector, which mainly adds the batchId information (i.e. the TransactionAttempt) to each tuple
- TridentSpoutExecutor's ITridentSpout is the BatchSpoutExecutor wrapping the user's original spout (of type IBatchSpout; assuming the original spout is an IBatchSpout, it is adapted to ITridentSpout by BatchSpoutExecutor). Its execute method handles each stream type differently: for MasterBatchCoordinator.COMMIT_STREAM_ID ($commit) from the master, the emitter's commit method is called to commit the current TransactionAttempt (this example carries no commit information) and the tx is removed from _activeBatches; for MasterBatchCoordinator.SUCCESS_STREAM_ID ($success) from the master, the TransactionAttempts in _activeBatches with txId smaller than this txId are removed first, then the emitter's success method is called to mark the TransactionAttempt as successful, which calls back the ack method of the original spout (the IBatchSpout)
- Tuples on neither MasterBatchCoordinator.COMMIT_STREAM_ID ($commit) nor MasterBatchCoordinator.SUCCESS_STREAM_ID ($success) are the messages that start a batch: the batchId is set, the emitter's emitBatch is called to emit the data (the batchId passed here is the TransactionAttempt's txId), and the TransactionAttempt is put into _activeBatches (a batch here corresponds to a TransactionAttempt); a sketch of the emitter interface behind these calls follows this list
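ITridentSpout.Emitter is the contract behind those emitBatch/success calls (for an IBatchSpout source, the concrete implementation is BatchSpoutEmitter, as noted later). A minimal no-op sketch of its shape; the metadata type is illustrative:

import java.util.Map;
import org.apache.storm.trident.operation.TridentCollector;
import org.apache.storm.trident.spout.ITridentSpout;
import org.apache.storm.trident.topology.TransactionAttempt;

// Sketch: the emitter contract TridentSpoutExecutor drives.
public class NoopEmitter implements ITridentSpout.Emitter<Map> {
    @Override
    public void emitBatch(TransactionAttempt tx, Map coordinatorMeta, TridentCollector collector) {
        // emit all tuples for this batch; for the same (txid, metadata) a
        // transactional spout must replay exactly the same tuples
    }

    @Override
    public void success(TransactionAttempt tx) {
        // $success received: the batch is fully committed, state may be cleaned up
    }

    @Override
    public void close() {
    }
}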
FixedBatchSpout
storm-1.2.2/storm-core/src/jvm/org/apache/storm/trident/testing/FixedBatchSpout.java
public class FixedBatchSpout implements IBatchSpout {

    Fields fields;
    List<Object>[] outputs;
    int maxBatchSize;
    HashMap<Long, List<List<Object>>> batches = new HashMap<Long, List<List<Object>>>();

    public FixedBatchSpout(Fields fields, int maxBatchSize, List<Object>... outputs) {
        this.fields = fields;
        this.outputs = outputs;
        this.maxBatchSize = maxBatchSize;
    }

    int index = 0;
    boolean cycle = false;

    public void setCycle(boolean cycle) {
        this.cycle = cycle;
    }

    @Override
    public void open(Map conf, TopologyContext context) {
        index = 0;
    }

    @Override
    public void emitBatch(long batchId, TridentCollector collector) {
        List<List<Object>> batch = this.batches.get(batchId);
        if(batch == null){
            batch = new ArrayList<List<Object>>();
            if(index>=outputs.length && cycle) {
                index = 0;
            }
            for(int i=0; index < outputs.length && i < maxBatchSize; index++, i++) {
                batch.add(outputs[index]);
            }
            this.batches.put(batchId, batch);
        }
        for(List<Object> list : batch){
            collector.emit(list);
        }
    }

    @Override
    public void ack(long batchId) {
        this.batches.remove(batchId);
    }

    @Override
    public void close() {
    }

    @Override
    public Map<String, Object> getComponentConfiguration() {
        Config conf = new Config();
        conf.setMaxTaskParallelism(1);
        return conf;
    }

    @Override
    public Fields getOutputFields() {
        return fields;
    }
}
- The spout the user works with is of type IBatchSpout; FixedBatchSpout caches the tuple data for each batchId, so replaying a batchId re-emits exactly the same tuples, which implements transactional spout semantics. A sketch of a custom IBatchSpout following the same pattern is shown below
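Following the same pattern, a custom IBatchSpout must cache what it emitted per batchId so a replay is deterministic. A hedged minimal sketch backed by an in-memory queue (the class name, queue source, batch size, and field name are all illustrative):

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import org.apache.storm.task.TopologyContext;
import org.apache.storm.trident.operation.TridentCollector;
import org.apache.storm.trident.spout.IBatchSpout;
import org.apache.storm.tuple.Fields;
import org.apache.storm.tuple.Values;

// Sketch: a transactional IBatchSpout draining an in-memory queue.
public class QueueBatchSpout implements IBatchSpout {
    private final Queue<String> source = new ConcurrentLinkedQueue<>();
    private final Map<Long, List<List<Object>>> batches = new HashMap<>();
    private final int maxBatchSize = 10;

    @Override
    public void open(Map conf, TopologyContext context) { }

    @Override
    public void emitBatch(long batchId, TridentCollector collector) {
        // on replay the batchId is already cached, so the same tuples go out again
        List<List<Object>> batch = batches.computeIfAbsent(batchId, id -> {
            List<List<Object>> fresh = new ArrayList<>();
            for (int i = 0; i < maxBatchSize && !source.isEmpty(); i++) {
                fresh.add(new Values(source.poll()));
            }
            return fresh;
        });
        for (List<Object> tuple : batch) {
            collector.emit(tuple);
        }
    }

    @Override
    public void ack(long batchId) {
        batches.remove(batchId); // $success reached us: safe to forget the batch
    }

    @Override
    public void close() { }

    @Override
    public Map<String, Object> getComponentConfiguration() { return null; }

    @Override
    public Fields getOutputFields() { return new Fields("line"); }
}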
TridentTopology.newStream
storm-1.2.2/storm-core/src/jvm/org/apache/storm/trident/TridentTopology.java
public Stream newStream(String txId, IRichSpout spout) {
    return newStream(txId, new RichSpoutBatchExecutor(spout));
}

public Stream newStream(String txId, IBatchSpout spout) {
    Node n = new SpoutNode(getUniqueStreamId(), spout.getOutputFields(), txId, spout, SpoutNode.SpoutType.BATCH);
    return addNode(n);
}

public Stream newStream(String txId, ITridentSpout spout) {
    Node n = new SpoutNode(getUniqueStreamId(), spout.getOutputFields(), txId, spout, SpoutNode.SpoutType.BATCH);
    return addNode(n);
}

public Stream newStream(String txId, IPartitionedTridentSpout spout) {
    return newStream(txId, new PartitionedTridentSpoutExecutor(spout));
}

public Stream newStream(String txId, IOpaquePartitionedTridentSpout spout) {
    return newStream(txId, new OpaquePartitionedTridentSpoutExecutor(spout));
}

public Stream newStream(String txId, ITridentDataSource dataSource) {
    if (dataSource instanceof IBatchSpout) {
        return newStream(txId, (IBatchSpout) dataSource);
    } else if (dataSource instanceof ITridentSpout) {
        return newStream(txId, (ITridentSpout) dataSource);
    } else if (dataSource instanceof IPartitionedTridentSpout) {
        return newStream(txId, (IPartitionedTridentSpout) dataSource);
    } else if (dataSource instanceof IOpaquePartitionedTridentSpout) {
        return newStream(txId, (IOpaquePartitionedTridentSpout) dataSource);
    } else {
        throw new UnsupportedOperationException("Unsupported stream");
    }
}
- In TridentTopology.newStream the user can directly use a spout of type IBatchSpout; the benefit is that TridentTopology wraps it into an ITridentSpout with BatchSpoutExecutor at build time (sparing users from implementing the ITridentSpout interfaces themselves and hiding the trident spout plumbing, so that users accustomed to plain topologies can get started with trident topologies quickly)
- BatchSpoutExecutor implements the ITridentSpout interface and adapts IBatchSpout to ITridentSpout, using EmptyCoordinator as the coordinator and BatchSpoutEmitter as the emitter
- If the spout passed to TridentTopology.newStream is of type IPartitionedTridentSpout, newStream itself wraps it into an ITridentSpout with PartitionedTridentSpoutExecutor; an IOpaquePartitionedTridentSpout is likewise wrapped into an ITridentSpout with OpaquePartitionedTridentSpoutExecutor
Summary
- In its newStream and build methods, TridentTopology adapts the ITridentDataSource variants that are not ITridentSpout to ITridentSpout: IBatchSpout (in build), IPartitionedTridentSpout (in newStream), and IOpaquePartitionedTridentSpout (in newStream), using BatchSpoutExecutor, PartitionedTridentSpoutExecutor, and OpaquePartitionedTridentSpoutExecutor respectively. (When TridentTopologyBuilder runs buildTopology, a spout of type ITridentSpout is first wrapped in TridentSpoutExecutor, then in TridentBoltExecutor, and finally turned into a bolt; the only real spout of the whole TridentTopology is the MasterBatchCoordinator. So an IBatchSpout is first adapted to ITridentSpout by BatchSpoutExecutor, then wrapped into a bolt by TridentSpoutExecutor and TridentBoltExecutor.)
- IBatchSpout's ack is at batch granularity, i.e. per TransactionAttempt; note that there is no fail method. If emitBatch throws a FailedException, TridentBoltExecutor calls failBatch (the tuples of a batch all run to completion before checkFinish is triggered), performing reportError and marking the TrackedBatch as failed; later, when checkFinish finds tracked.failed == true, it calls _collector.fail(tuple), which calls back MasterBatchCoordinator's fail method
- MasterBatchCoordinator's fail removes the current TransactionAttempt from _activeTx, also removes all entries with txId greater than the failed txId, and then calls sync to continue with TransactionAttempts (note that _currTransaction is not changed, so retries continue from the failed txId; only the ack method advances _currTransaction to nextTransactionId)
- TridentBoltExecutor's execute uses tick tuples to check whether the time since the last rotate exceeds _messageTimeoutMs (the maximum Config.TOPOLOGY_MESSAGE_TIMEOUT_SECS among components * 1000, the *1000 converting seconds to milliseconds), and rotates _batches when it does, dropping the oldest bucket. With the tick frequency of 5 seconds and Config.TOPOLOGY_MESSAGE_TIMEOUT_SECS at 30, _messageTimeoutMs is 30*1000, so roughly every 5 seconds it checks whether more than 30 seconds have passed since the last rotate; if so, it rotates and discards the oldest bucket of data (TrackedBatch), which effectively resets timed-out TrackedBatch state
- MasterBatchCoordinator's fail can be triggered in several ways: a downstream component actively throws FailedException, which triggers the master's fail and a retry of the TransactionAttempt; or a downstream component takes longer than Config.TOPOLOGY_MESSAGE_TIMEOUT_SECS (topology.message.timeout.secs, 30 by default in defaults.yaml) to process the batch, in which case the batch tuple times out, triggering the master's fail, and the TransactionAttempt fails and is retried. There is currently no limit on the number of attempts, which deserves attention in production: as soon as a single tuple of a batchId fails, all tuples of that batchId are re-emitted, and if downstream is not prepared for this, a batch whose first tuples succeeded and later ones failed will have its successful tuples reprocessed over and over (avoiding this problem of partly-succeeded, partly-failed tuples within a failed batch requires using Trident's State; see the sketch below)
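The standard way to make those replays idempotent is to funnel updates through Trident's State abstraction, where the txid stored alongside each value lets Trident apply a replayed batch's update at most once. A minimal sketch reusing this article's FixedBatchSpout example; MemoryMapState is the in-memory transactional state from storm-core's trident testing package, so a production topology would swap in a durable state backend instead:

import org.apache.storm.trident.TridentState;
import org.apache.storm.trident.TridentTopology;
import org.apache.storm.trident.operation.builtin.Count;
import org.apache.storm.trident.testing.FixedBatchSpout;
import org.apache.storm.trident.testing.MemoryMapState;
import org.apache.storm.tuple.Fields;
import org.apache.storm.tuple.Values;

public class StateSketch {
    public static TridentTopology build() {
        FixedBatchSpout spout = new FixedBatchSpout(new Fields("user", "score"), 3,
                new Values("nickt1", 4), new Values("nickt2", 7), new Values("nickt3", 8));
        spout.setCycle(false);

        TridentTopology topology = new TridentTopology();
        // MemoryMapState records the txid with each value, so when a failed
        // batch is replayed the per-user counts are updated exactly once
        TridentState counts = topology.newStream("spout1", spout)
                .groupBy(new Fields("user"))
                .persistentAggregate(new MemoryMapState.Factory(), new Count(), new Fields("count"));
        return topology;
    }
}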
doc
Source: oschina
Link: https://my.oschina.net/u/2922256/blog/2874373