I am implementing a custom directed acyclic graph (DAG) RNN layer where the input is a DAG and each DAG node may depend on any number of other DAG nodes that come before it. For