Question
I am experimenting with a mixed multinomial discrete choice model in TensorFlow Probability. The model takes as input a choice among 3 alternatives. The chosen alternative is specified by CHOSEN (named CHOICE in the code below), a tensor of shape [# observations, 3]. I asked a related question earlier, but the code and the question have changed quite a bit since then:
Tensorflow Probability Error: OperatorNotAllowedInGraphError: iterating over `tf.Tensor` is not allowed
Judging from the source code of Multinomial(), I should be able to pass CHOSEN as total_count and get the correct result, given how total_count enters the log-likelihood function. The model is a basic multinomial logit (softmax) choice model in which logits holds the systematic utility of each alternative. With 6768 observations and 3 alternatives, I currently get:
ValueError: Dimensions must be equal, but are 3 and 6768 for '{{node Multinomial_1/sample/draw_sample/mul}} = Mul[T=DT_INT32](Multinomial_1/sample/draw_sample/ones_like, Multinomial_1/sample/Cast)' with input shapes: [3,6768], [6768,3].
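The shapes in that message hint at where the mismatch comes from: a Python list of three length-6768 tensors converts to a tensor of shape [3, 6768], while total_count has shape [6768, 3]. A minimal plain-Python sketch of the NumPy-style broadcasting rule, which TensorFlow also follows, shows why these two shapes cannot be combined. Note that broadcast_shape is a hypothetical helper written for illustration, not a TensorFlow function.

```python
def broadcast_shape(a, b):
    """Return the broadcast of two shapes, or raise ValueError if incompatible."""
    result = []
    # Compare dimensions from the trailing end, padding the shorter shape with 1s.
    for i in range(1, max(len(a), len(b)) + 1):
        da = a[-i] if i <= len(a) else 1
        db = b[-i] if i <= len(b) else 1
        if da != db and da != 1 and db != 1:
            raise ValueError(f"incompatible dimensions {da} and {db}")
        result.append(db if da == 1 else da)
    return tuple(reversed(result))


N, K = 6768, 3  # observations and alternatives, as in the question

# Logits stacked so that alternatives are the last axis match total_count.
print(broadcast_shape((N, K), (N, K)))

# logits=[V1, V2, V3] has shape (K, N), which clashes with (N, K).
try:
    broadcast_shape((K, N), (N, K))
except ValueError as err:
    print("broadcast failed:", err)
```

Under this rule, both arguments need the alternatives on the same (last) axis before they can be combined elementwise.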
I tried transposing the total_count tensor and got:
ValueError: The two structures don't have the same sequence length. Input structure has length 5, while shallow structure has length 10.
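For reference, the multinomial logit likelihood described above can be written out without TFP. The sketch below assumes one-hot choice rows and exactly one choice per observation. Under that assumption the multinomial coefficient is 1, so the log-probability of an observation is the log-softmax of its utilities evaluated at the chosen alternative. The function names and the toy data are hypothetical.

```python
import math


def log_softmax(utilities):
    """Log choice probabilities for one observation, given its utilities."""
    m = max(utilities)  # subtract the max for numerical stability
    log_norm = m + math.log(sum(math.exp(u - m) for u in utilities))
    return [u - log_norm for u in utilities]


def mnl_log_likelihood(utilities, chosen):
    """Sum of log-probabilities of the chosen alternatives.

    utilities: one [V1, V2, V3] row per observation.
    chosen: matching one-hot rows, e.g. [0, 1, 0].
    """
    total = 0.0
    for v_row, y_row in zip(utilities, chosen):
        log_p = log_softmax(v_row)
        total += sum(y * lp for y, lp in zip(y_row, log_p))
    return total


# Hypothetical toy data: 2 observations, 3 alternatives.
V = [[0.0, 0.0, 0.0], [1.0, 0.0, -1.0]]
Y = [[1, 0, 0], [0, 1, 0]]
print(mnl_log_likelihood(V, Y))
```

A hand-written version like this can serve as a check on the value returned by the model's final component once the shapes line up.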
I have the following joint distribution function (plus helper function and log_prob() call):
import tensorflow as tf
import tensorflow_probability as tfp

tfd = tfp.distributions

# DATA, IDX, CHOICE and num_idx are defined elsewhere (not shown).

def mmnl_func():
    return tfd.JointDistributionSequential([
        tfd.Normal(loc=0., scale=1e5),  # mu_b_time
        tfd.HalfCauchy(loc=0., scale=5),  # sigma_b_time
        lambda sigma_b_time, mu_b_time: tfd.MultivariateNormalDiag(  # b_time
            loc=affine(tf.ones([num_idx]), mu_b_time[..., tf.newaxis]),
            scale_identity_multiplier=sigma_b_time),
        tfd.Normal(loc=0, scale=1e5),  # a_train
        tfd.Normal(loc=0, scale=1e5),  # a_car
        tfd.Normal(loc=0, scale=1e5),  # b_cost
        lambda b_cost, a_car, a_train, b_time: tfd.Deterministic(loc=affine(DATA[:, 0], tf.gather(b_time, IDX, axis=-1), (a_train + b_cost * DATA[:, 1]))),  # V1
        lambda V1, b_cost, a_car, a_train, b_time: tfd.Deterministic(loc=affine(DATA[:, 2], tf.gather(b_time, IDX, axis=-1), (b_cost * DATA[:, 3]))),  # V2
        lambda V2, V1, b_cost, a_car, a_train, b_time: tfd.Deterministic(loc=affine(DATA[:, 4], tf.gather(b_time, IDX, axis=-1), (a_car + b_cost * DATA[:, 5]))),  # V3
        lambda V3, V2, V1: tfd.Multinomial(  # y
            total_count=CHOICE,
            logits=[V1, V2, V3])
    ])

@tf.function
def mmnl_log_prob(a_train, a_car, b_cost, mu_b_time, sigma_b_time):
    return mmnl_func().log_prob(
        [mu_b_time, sigma_b_time, a_train, a_car, b_cost])

@tf.function
def affine(x, kernel_diag, bias=tf.zeros([])):
    """`kernel_diag * x + bias` with broadcasting."""
    kernel_diag = tf.ones_like(x) * kernel_diag
    bias = tf.ones_like(x) * bias
    return x * kernel_diag + bias
Update
If I change:
logits=[V1,V2,V3]
to:
logits=tf.stack([V1,V2,V3],axis=1)
I get the error and traceback below. I am not sure how the inputs to log_prob() work or how they interact with the joint distribution function. The error seems to arise because I pass 5 inputs while log_prob() expects 10, one for each of the 10 entries in the joint distribution's list.
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-26-bb6078bf0f7c> in <module>()
40 return samples_nuts_, stats_nuts_
41
---> 42 samples_nuts, stats_nuts = nuts_sampler(initial_state)
8 frames
/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/func_graph.py in wrapper(*args, **kwargs)
966 except Exception as e: # pylint:disable=broad-except
967 if hasattr(e, "ag_error_metadata"):
--> 968 raise e.ag_error_metadata.to_exception(e)
969 else:
970 raise
ValueError: in user code:
<ipython-input-26-bb6078bf0f7c>:34 nuts_sampler *
samples_nuts_, stats_nuts_ = tfp.mcmc.sample_chain(
<ipython-input-25-39abca09aae1>:28 mmnl_log_prob *
[mu_b_time, sigma_b_time, a_train, a_car,b_cost])
/usr/local/lib/python3.6/dist-packages/tensorflow_probability/python/distributions/joint_distribution.py:443 log_prob **
return self._call_log_prob(value, **unmatched_kwargs)
/usr/local/lib/python3.6/dist-packages/tensorflow_probability/python/distributions/distribution.py:862 _call_log_prob
value = _convert_to_tensor(value, name='value', dtype_hint=self.dtype)
/usr/local/lib/python3.6/dist-packages/tensorflow_probability/python/distributions/distribution.py:172 _convert_to_tensor
check_types=False)
/usr/local/lib/python3.6/dist-packages/tensorflow/python/util/nest.py:1118 map_structure_up_to
**kwargs)
/usr/local/lib/python3.6/dist-packages/tensorflow/python/util/nest.py:1200 map_structure_with_tuple_paths_up_to
expand_composites=expand_composites)
/usr/local/lib/python3.6/dist-packages/tensorflow/python/util/nest.py:835 assert_shallow_structure
input_length=len(input_tree), shallow_length=len(shallow_tree)))
ValueError: The two structures don't have the same sequence length. Input structure has length 5, while shallow structure has length 10.
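The length mismatch reported in the traceback can be reproduced in miniature. The sketch below is a plain-Python stand-in, not TFP's implementation: it lists the ten entries of the JointDistributionSequential above by name and checks a value list against them. check_value_structure is a hypothetical helper introduced only for illustration.

```python
# Names of the ten entries in the JointDistributionSequential list, in order.
COMPONENTS = ["mu_b_time", "sigma_b_time", "b_time", "a_train", "a_car",
              "b_cost", "V1", "V2", "V3", "y"]


def check_value_structure(values, components=COMPONENTS):
    """Pair values with components positionally; raise if the lengths differ."""
    if len(values) != len(components):
        raise ValueError(
            f"Input structure has length {len(values)}, "
            f"while shallow structure has length {len(components)}.")
    return dict(zip(components, values))


# The five parameters that mmnl_log_prob passes to log_prob().
passed = ["mu_b_time", "sigma_b_time", "a_train", "a_car", "b_cost"]

# Entries of the joint distribution that receive no value.
missing = [name for name in COMPONENTS if name not in passed]
print(missing)

try:
    check_value_structure(passed)
except ValueError as err:
    print(err)
```

Because values are matched by position, the per-individual b_time, the three Deterministic utilities, and the observed y all count toward the expected length of 10.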
Source: https://stackoverflow.com/questions/61236004/specification-of-multinomial-model-in-tensorflow-probability