Preface
This article takes a look at state in Storm Trident.
StateType
storm-2.0.0/storm-client/src/jvm/org/apache/storm/trident/state/StateType.java
public enum StateType {
    NON_TRANSACTIONAL,
    TRANSACTIONAL,
    OPAQUE
}
- StateType has three values: NON_TRANSACTIONAL, TRANSACTIONAL, and OPAQUE (opaque transactional)
- Spouts fall into the same three categories: non-transactional, transactional, and opaque transactional
State
storm-2.0.0/storm-client/src/jvm/org/apache/storm/trident/state/State.java
/**
 * There's 3 different kinds of state:
 *
 * 1. non-transactional: ignores commits, updates are permanent. no rollback. a cassandra incrementing state would be like this 2.
 * repeat-transactional: idempotent as long as all batches for a txid are identical 3. opaque-transactional: the most general kind of state.
 * updates are always done based on the previous version of the value if the current commit = latest stored commit Idempotent even if the
 * batch for a txid can change.
 *
 * repeat transactional is idempotent for transactional spouts opaque transactional is idempotent for opaque or transactional spouts
 *
 * Trident should log warnings when state is idempotent but updates will not be idempotent because of spout
 */
// retrieving is encapsulated in Retrieval interface
public interface State {
    void beginCommit(Long txid); // can be null for things like partitionPersist occuring off a DRPC stream

    void commit(Long txid);
}
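- non-transactional: ignores commits; updates are permanent and there is no rollback. A Cassandra incrementing state is of this kind. The semantics are at-most-once or at-least-once
- repeat-transactional, or transactional for short: requires that a batch keeps the same txid whether or not it is replayed and that the tuples in it do not change. A tuple belongs to exactly one batch, and batches do not overlap. For state updates, if a replay finds that the stored txid equals the batch's txid, the update can simply be skipped. It needs less state in the database but is less fault-tolerant; it guarantees exactly-once semantics
- opaque-transactional, or opaque for short: the more commonly used kind. It is more fault-tolerant than transactional because it does not require a tuple to stay in the same batch/txid. A tuple may fail in one batch and succeed in another, yet each tuple is still processed successfully exactly once, in exactly one batch. OpaqueTridentKafkaSpout is an implementation of this kind, and it tolerates the loss of Kafka nodes. For state updates, if a replay finds the same txid, the new value is recomputed from prevValue and overwrites the current value. It needs more space in the database to store state, but it is more fault-tolerant and still guarantees exactly-once semantics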
MapState
storm-2.0.0/storm-client/src/jvm/org/apache/storm/trident/state/map/MapState.java
public interface MapState<T> extends ReadOnlyMapState<T> {
    List<T> multiUpdate(List<List<Object>> keys, List<ValueUpdater> updaters);

    void multiPut(List<List<Object>> keys, List<T> vals);
}
- MapState extends the ReadOnlyMapState interface, which in turn extends the State interface
- The following sections walk through several MapState implementations
NonTransactionalMap
storm-2.0.0/storm-client/src/jvm/org/apache/storm/trident/state/map/NonTransactionalMap.java
public class NonTransactionalMap<T> implements MapState<T> {
    IBackingMap<T> _backing;

    protected NonTransactionalMap(IBackingMap<T> backing) {
        _backing = backing;
    }

    public static <T> MapState<T> build(IBackingMap<T> backing) {
        return new NonTransactionalMap<T>(backing);
    }

    @Override
    public List<T> multiGet(List<List<Object>> keys) {
        return _backing.multiGet(keys);
    }

    @Override
    public List<T> multiUpdate(List<List<Object>> keys, List<ValueUpdater> updaters) {
        List<T> curr = _backing.multiGet(keys);
        List<T> ret = new ArrayList<T>(curr.size());
        for (int i = 0; i < curr.size(); i++) {
            T currVal = curr.get(i);
            ValueUpdater<T> updater = updaters.get(i);
            ret.add(updater.update(currVal));
        }
        _backing.multiPut(keys, ret);
        return ret;
    }

    @Override
    public void multiPut(List<List<Object>> keys, List<T> vals) {
        _backing.multiPut(keys, vals);
    }

    @Override
    public void beginCommit(Long txid) {
    }

    @Override
    public void commit(Long txid) {
    }
}
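- NonTransactionalMap wraps an IBackingMap; its beginCommit and commit methods do nothing
- multiUpdate reads the current values with multiGet, applies each ValueUpdater to build List<T> ret, and writes the result back through IBackingMap's multiPut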
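Because the txid is never recorded, a replayed batch is simply applied a second time. The following is a minimal sketch of that effect. It does not use Storm: NonTxCounterSketch and its addAll and get methods are hypothetical stand-ins, and a plain HashMap plays the role of the IBackingMap.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a non-transactional counter state; not a Storm class.
class NonTxCounterSketch {
    private final Map<String, Long> backing = new HashMap<>();

    // As in NonTransactionalMap, both hooks are no-ops: the txid is never stored.
    void beginCommit(Long txid) {
    }

    void commit(Long txid) {
    }

    // Read-modify-write without any txid check.
    void addAll(List<String> words) {
        for (String word : words) {
            backing.merge(word, 1L, Long::sum);
        }
    }

    long get(String word) {
        return backing.getOrDefault(word, 0L);
    }

    public static void main(String[] args) {
        NonTxCounterSketch state = new NonTxCounterSketch();
        List<String> batch = Arrays.asList("storm", "trident", "storm");

        // First attempt of the batch with txid 1.
        state.beginCommit(1L);
        state.addAll(batch);
        state.commit(1L);

        // Replay of the same batch: every word is counted again.
        state.beginCommit(1L);
        state.addAll(batch);
        state.commit(1L);

        System.out.println("storm = " + state.get("storm")); // prints "storm = 4"
    }
}
```

The replay turns the correct count of 2 into 4, which is the at-least-once behaviour described for non-transactional state.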
TransactionalMap
storm-2.0.0/storm-client/src/jvm/org/apache/storm/trident/state/map/TransactionalMap.java
public class TransactionalMap<T> implements MapState<T> {
    CachedBatchReadsMap<TransactionalValue> _backing;
    Long _currTx;

    protected TransactionalMap(IBackingMap<TransactionalValue> backing) {
        _backing = new CachedBatchReadsMap(backing);
    }

    public static <T> MapState<T> build(IBackingMap<TransactionalValue> backing) {
        return new TransactionalMap<T>(backing);
    }

    @Override
    public List<T> multiGet(List<List<Object>> keys) {
        List<CachedBatchReadsMap.RetVal<TransactionalValue>> vals = _backing.multiGet(keys);
        List<T> ret = new ArrayList<T>(vals.size());
        for (CachedBatchReadsMap.RetVal<TransactionalValue> retval : vals) {
            TransactionalValue v = retval.val;
            if (v != null) {
                ret.add((T) v.getVal());
            } else {
                ret.add(null);
            }
        }
        return ret;
    }

    @Override
    public List<T> multiUpdate(List<List<Object>> keys, List<ValueUpdater> updaters) {
        List<CachedBatchReadsMap.RetVal<TransactionalValue>> curr = _backing.multiGet(keys);
        List<TransactionalValue> newVals = new ArrayList<TransactionalValue>(curr.size());
        List<List<Object>> newKeys = new ArrayList();
        List<T> ret = new ArrayList<T>();
        for (int i = 0; i < curr.size(); i++) {
            CachedBatchReadsMap.RetVal<TransactionalValue> retval = curr.get(i);
            TransactionalValue<T> val = retval.val;
            ValueUpdater<T> updater = updaters.get(i);
            TransactionalValue<T> newVal;
            boolean changed = false;
            if (val == null) {
                newVal = new TransactionalValue<T>(_currTx, updater.update(null));
                changed = true;
            } else {
                if (_currTx != null && _currTx.equals(val.getTxid()) && !retval.cached) {
                    newVal = val;
                } else {
                    newVal = new TransactionalValue<T>(_currTx, updater.update(val.getVal()));
                    changed = true;
                }
            }
            ret.add(newVal.getVal());
            if (changed) {
                newVals.add(newVal);
                newKeys.add(keys.get(i));
            }
        }
        if (!newKeys.isEmpty()) {
            _backing.multiPut(newKeys, newVals);
        }
        return ret;
    }

    @Override
    public void multiPut(List<List<Object>> keys, List<T> vals) {
        List<TransactionalValue> newVals = new ArrayList<TransactionalValue>(vals.size());
        for (T val : vals) {
            newVals.add(new TransactionalValue<T>(_currTx, val));
        }
        _backing.multiPut(keys, newVals);
    }

    @Override
    public void beginCommit(Long txid) {
        _currTx = txid;
        _backing.reset();
    }

    @Override
    public void commit(Long txid) {
        _currTx = null;
        _backing.reset();
    }
}
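- TransactionalMap uses a CachedBatchReadsMap<TransactionalValue>, so the stored type is TransactionalValue. beginCommit sets the current txid and resets _backing; commit clears the txid and resets _backing again
- In multiUpdate, if _currTx is non-null, equals the txid of the stored value, and !retval.cached holds (the value was not put during the current transaction), the update is skipped and newVal = val is used
- multiPut builds a list of TransactionalValue and calls CachedBatchReadsMap.multiPut(List<List<Object>> keys, List<T> vals), which writes the values to the underlying map and then updates the cache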
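The skip rule can be sketched without Storm as follows. This is a simplified model rather than the real class: TxCounterSketch, TxValue, and the add and get methods are hypothetical names, a HashMap stands in for the backing store, and a set of keys written in the current commit plays the role of retval.cached.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical transactional counter; a simplified model, not Storm's TransactionalMap.
class TxCounterSketch {
    // Stored per key: the txid of the last batch applied and the resulting value.
    static final class TxValue {
        final Long txid;
        final long val;

        TxValue(Long txid, long val) {
            this.txid = txid;
            this.val = val;
        }
    }

    private final Map<String, TxValue> backing = new HashMap<>();
    // Keys written during the current commit; stands in for retval.cached.
    private final Set<String> writtenThisCommit = new HashSet<>();
    private Long currTx;

    void beginCommit(Long txid) {
        currTx = txid;
        writtenThisCommit.clear();
    }

    void commit(Long txid) {
        currTx = null;
        writtenThisCommit.clear();
    }

    long add(String key, long delta) {
        TxValue stored = backing.get(key);
        boolean cached = writtenThisCommit.contains(key);
        if (stored != null && currTx != null && currTx.equals(stored.txid) && !cached) {
            // The stored value already contains this batch: skip the update.
            return stored.val;
        }
        long base = (stored == null) ? 0L : stored.val;
        TxValue updated = new TxValue(currTx, base + delta);
        backing.put(key, updated);
        writtenThisCommit.add(key);
        return updated.val;
    }

    long get(String key) {
        TxValue stored = backing.get(key);
        return (stored == null) ? 0L : stored.val;
    }

    public static void main(String[] args) {
        TxCounterSketch state = new TxCounterSketch();

        state.beginCommit(1L);          // first attempt of txid 1
        state.add("storm", 1);
        state.add("storm", 1);          // same commit, so it is not skipped
        state.commit(1L);

        state.beginCommit(1L);          // replay of txid 1 with identical tuples
        state.add("storm", 1);
        state.add("storm", 1);
        state.commit(1L);

        state.beginCommit(2L);          // next batch
        state.add("storm", 1);
        state.commit(2L);

        System.out.println("storm = " + state.get("storm")); // prints "storm = 3"
    }
}
```

The cached check is what lets two updates to the same key inside one commit both take effect, while a replay, which starts with an empty cache, is skipped entirely.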
OpaqueMap
storm-2.0.0/storm-client/src/jvm/org/apache/storm/trident/state/map/OpaqueMap.java
public class OpaqueMap<T> implements MapState<T> {
    CachedBatchReadsMap<OpaqueValue> _backing;
    Long _currTx;

    protected OpaqueMap(IBackingMap<OpaqueValue> backing) {
        _backing = new CachedBatchReadsMap(backing);
    }

    public static <T> MapState<T> build(IBackingMap<OpaqueValue> backing) {
        return new OpaqueMap<T>(backing);
    }

    @Override
    public List<T> multiGet(List<List<Object>> keys) {
        List<CachedBatchReadsMap.RetVal<OpaqueValue>> curr = _backing.multiGet(keys);
        List<T> ret = new ArrayList<T>(curr.size());
        for (CachedBatchReadsMap.RetVal<OpaqueValue> retval : curr) {
            OpaqueValue val = retval.val;
            if (val != null) {
                if (retval.cached) {
                    ret.add((T) val.getCurr());
                } else {
                    ret.add((T) val.get(_currTx));
                }
            } else {
                ret.add(null);
            }
        }
        return ret;
    }

    @Override
    public List<T> multiUpdate(List<List<Object>> keys, List<ValueUpdater> updaters) {
        List<CachedBatchReadsMap.RetVal<OpaqueValue>> curr = _backing.multiGet(keys);
        List<OpaqueValue> newVals = new ArrayList<OpaqueValue>(curr.size());
        List<T> ret = new ArrayList<T>();
        for (int i = 0; i < curr.size(); i++) {
            CachedBatchReadsMap.RetVal<OpaqueValue> retval = curr.get(i);
            OpaqueValue<T> val = retval.val;
            ValueUpdater<T> updater = updaters.get(i);
            T prev;
            if (val == null) {
                prev = null;
            } else {
                if (retval.cached) {
                    prev = val.getCurr();
                } else {
                    prev = val.get(_currTx);
                }
            }
            T newVal = updater.update(prev);
            ret.add(newVal);
            OpaqueValue<T> newOpaqueVal;
            if (val == null) {
                newOpaqueVal = new OpaqueValue<T>(_currTx, newVal);
            } else {
                newOpaqueVal = val.update(_currTx, newVal);
            }
            newVals.add(newOpaqueVal);
        }
        _backing.multiPut(keys, newVals);
        return ret;
    }

    @Override
    public void multiPut(List<List<Object>> keys, List<T> vals) {
        List<ValueUpdater> updaters = new ArrayList<ValueUpdater>(vals.size());
        for (T val : vals) {
            updaters.add(new ReplaceUpdater<T>(val));
        }
        multiUpdate(keys, updaters);
    }

    @Override
    public void beginCommit(Long txid) {
        _currTx = txid;
        _backing.reset();
    }

    @Override
    public void commit(Long txid) {
        _currTx = null;
        _backing.reset();
    }

    static class ReplaceUpdater<T> implements ValueUpdater<T> {
        T _t;

        public ReplaceUpdater(T t) {
            _t = t;
        }

        @Override
        public T update(Object stored) {
            return _t;
        }
    }
}
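- OpaqueMap uses a CachedBatchReadsMap<OpaqueValue>, so the stored type is OpaqueValue. beginCommit sets the current txid and resets _backing; commit clears the txid and resets _backing again
- Unlike TransactionalMap, multiPut here wraps each value in a ReplaceUpdater and then calls multiUpdate to force an overwrite
- multiUpdate also differs from TransactionalMap: it never skips. It picks a prev value (val.getCurr() if the entry was written in the current commit, otherwise val.get(_currTx)), computes newVal from it, and stores the result via val.update(_currTx, newVal)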
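The sketch below illustrates why keeping prev makes a replay safe even when the batch contents change. It is a simplified model rather than the real class. It assumes that OpaqueValue stores a txid together with a prev and a curr value, that get returns curr for a newer txid and prev for an equal txid, and that update keeps prev unchanged when the txid is equal. OpaqueCounterSketch, OpaqueVal, and the add and get methods are hypothetical names, and txids are assumed to be non-null.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical opaque counter; a simplified model, not Storm's OpaqueMap.
class OpaqueCounterSketch {
    // Stored per key: txid of the last batch applied, the value before that
    // batch (prev, null if there was none) and the value after it (curr).
    static final class OpaqueVal {
        final long txid;
        final Long prev;
        final long curr;

        OpaqueVal(long txid, Long prev, long curr) {
            this.txid = txid;
            this.prev = prev;
            this.curr = curr;
        }
    }

    private final Map<String, OpaqueVal> backing = new HashMap<>();
    // Keys written during the current commit; stands in for retval.cached.
    private final Set<String> writtenThisCommit = new HashSet<>();
    private long currTx; // assumption: txids are never null in this sketch

    void beginCommit(long txid) {
        currTx = txid;
        writtenThisCommit.clear();
    }

    void commit(long txid) {
        writtenThisCommit.clear();
    }

    long add(String key, long delta) {
        OpaqueVal stored = backing.get(key);
        long base;     // value the delta is applied to
        Long newPrev;  // snapshot of the value before the current batch
        if (stored == null) {
            base = 0L;
            newPrev = null;
        } else if (writtenThisCommit.contains(key)) {
            // Written earlier in this commit: build on curr, keep the snapshot.
            base = stored.curr;
            newPrev = stored.prev;
        } else if (stored.txid < currTx) {
            // New batch: curr becomes the snapshot.
            base = stored.curr;
            newPrev = stored.curr;
        } else if (stored.txid == currTx) {
            // Replay: discard the earlier attempt and restart from prev.
            base = (stored.prev == null) ? 0L : stored.prev;
            newPrev = stored.prev;
        } else {
            throw new IllegalStateException(
                "batch " + currTx + " is behind stored txid " + stored.txid);
        }
        long updated = base + delta;
        backing.put(key, new OpaqueVal(currTx, newPrev, updated));
        writtenThisCommit.add(key);
        return updated;
    }

    long get(String key) {
        OpaqueVal stored = backing.get(key);
        return (stored == null) ? 0L : stored.curr;
    }

    public static void main(String[] args) {
        OpaqueCounterSketch state = new OpaqueCounterSketch();

        state.beginCommit(1L);
        state.add("storm", 5);   // stored: txid 1, prev null, curr 5
        state.commit(1L);

        state.beginCommit(2L);
        state.add("storm", 2);   // first attempt of txid 2: prev 5, curr 7

        state.beginCommit(2L);   // replay of txid 2 with different contents
        state.add("storm", 3);   // restarts from prev 5: curr 8
        state.commit(2L);

        System.out.println("storm = " + state.get("storm")); // prints "storm = 8"
    }
}
```

A transactional state would have skipped the replay and kept 7, which is wrong once the batch contents have changed. The opaque state discards the first attempt and ends at 5 + 3 = 8.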
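Summary
- Trident applies state updates strictly in batch order; for example, the state update for the batch with txid 3 is applied only after the batch with txid 2 has been processed
- State comes in three kinds: non-transactional, transactional, and opaque transactional, and spouts come in the same three kinds
- non-transactional cannot guarantee exactly-once; it is either at-least-once or at-most-once. For its state computation see NonTransactionalMap, whose beginCommit and commit do nothing
- transactional guarantees exactly-once, but its requirements are strict: a batch must keep the same txid and the same tuples when it is replayed, which makes it less fault-tolerant. Its state computation is comparatively simple; see TransactionalMap, which skips the update when it encounters a value with the same txid
- opaque transactional also guarantees exactly-once. It allows a tuple that failed in one batch to be processed in a different batch, so it is more fault-tolerant, but its state computation has to store an extra prev value. See OpaqueMap: when it encounters a value with the same txid, it recomputes from the prev value and overwrites the current value
- Trident encapsulates the exactly-once state computation, so in practice it is enough to pass the appropriate StateFactory to persistentAggregate. Factories that support several state types can take a StateType property and build the corresponding kind of state from the parameter passed in. A custom state factory can be written by implementing StateFactory. A custom stateQuery lookup can be written by extending BaseQueryFunction, and a custom update by extending BaseStateUpdater and passing it to partitionPersist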
doc
Source: oschina
Link: https://my.oschina.net/u/2922256/blog/2436027