The reparameterization trick is a useful technique for estimating gradients of loss functions that involve stochastic variables. While score function estimators suffer from high variance, the reparameterization trick allows the gradient to be estimated with pathwise derivatives. Although the trick enables backpropagation through many kinds of continuous random variables, it has not been applicable to discrete random variables.
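To make the variance gap concrete, here is a minimal sketch that compares the two estimators on a toy problem. The objective $\mathbb{E}_{x \sim \mathcal{N}(\mu, 1)}[x^2]$, the function name `gradient_samples`, and the sample size are all assumptions chosen for illustration, not something taken from the paper. The true gradient with respect to $\mu$ is $2\mu$.

```python
import random
import statistics

def gradient_samples(mu, n, seed=0):
    """Per-sample gradient estimates of d/dmu E[x^2] with x ~ N(mu, 1)."""
    rng = random.Random(seed)
    score, pathwise = [], []
    for _ in range(n):
        eps = rng.gauss(0.0, 1.0)
        x = mu + eps  # reparameterized sample: x = mu + eps
        # Score function estimator: f(x) * d/dmu log N(x; mu, 1) = x^2 * (x - mu)
        score.append(x * x * (x - mu))
        # Pathwise estimator: d/dmu f(mu + eps) = 2 * (mu + eps)
        pathwise.append(2.0 * x)
    return score, pathwise

score, pathwise = gradient_samples(mu=1.5, n=20000)
print("score function: mean", statistics.mean(score), "variance", statistics.variance(score))
print("pathwise:       mean", statistics.mean(pathwise), "variance", statistics.variance(pathwise))
```

Both estimators target the same gradient, but in this toy setting the per-sample variance of the score function estimator should come out much larger than that of the pathwise estimator.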
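Reparameterization of discrete random variables is enabled by relaxing Categorical variables to Concrete variables (Continuous relaxations of discrete random variables). The Concrete variable is motivated by the Gumbel-Max trick. A Gumbel random variable can be defined as $G = -\log(-\log U)$ with $U \sim \mathrm{Uniform}(0, 1)$. Given unnormalized probabilities $\alpha_k > 0$ and independent Gumbel samples $G_k$, if we set $D_k = 1$ for the $k$ that maximizes $\log \alpha_k + G_k$ (and $D_i = 0$ for all other $i$), then $P(D_k = 1) = \alpha_k / \sum_{i=1}^{n} \alpha_i$. This turns sampling a discrete random variable into a deterministic transformation of uniform random variables.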
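The Gumbel-Max trick can be sketched in a few lines of standard-library Python. The helper names `sample_gumbel` and `gumbel_max_sample` and the example weights are assumptions made for this illustration; the small clamp on the uniform sample is only there to avoid taking the logarithm of zero.

```python
import math
import random

def sample_gumbel(rng):
    """Draw G = -log(-log U) with U ~ Uniform(0, 1)."""
    u = max(rng.random(), 1e-12)  # clamp so that log(u) is always defined
    return -math.log(-math.log(u))

def gumbel_max_sample(alphas, rng):
    """Return the index k that maximizes log(alpha_k) + G_k."""
    scores = [math.log(a) + sample_gumbel(rng) for a in alphas]
    return max(range(len(alphas)), key=lambda k: scores[k])

rng = random.Random(0)
alphas = [1.0, 2.0, 5.0]  # unnormalized probabilities (illustrative values)
draws = 20000
counts = [0] * len(alphas)
for _ in range(draws):
    counts[gumbel_max_sample(alphas, rng)] += 1

print("empirical:", [c / draws for c in counts])
print("target:   ", [a / sum(alphas) for a in alphas])
```

The empirical frequencies should be close to the normalized weights, which is a quick way to check the claim that the argmax index follows the Categorical distribution.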
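However, the argmax step of the Gumbel-Max trick is discontinuous, so it is not suitable for backpropagation. The Concrete distribution replaces the argmax with a softmax with temperature $\lambda > 0$: given parameters $\alpha_k > 0$ and independent Gumbel samples $G_k$, it sets $X_k = \exp((\log \alpha_k + G_k)/\lambda) / \sum_{i=1}^{n} \exp((\log \alpha_i + G_i)/\lambda)$. This approaches the argmax as $\lambda \to 0$. The probability density is as follows.

$$p_{\alpha,\lambda}(x) = (n-1)!\,\lambda^{n-1} \prod_{k=1}^{n} \frac{\alpha_k x_k^{-\lambda-1}}{\sum_{i=1}^{n} \alpha_i x_i^{-\lambda}}$$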
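A minimal sketch of drawing a Concrete sample, assuming the softmax-with-temperature construction described above. The function names and the example parameters are hypothetical; the softmax subtracts the maximum logit purely for numerical stability.

```python
import math
import random

def sample_gumbel(rng):
    """Draw G = -log(-log U) with U ~ Uniform(0, 1)."""
    u = max(rng.random(), 1e-12)  # clamp so that log(u) is always defined
    return -math.log(-math.log(u))

def concrete_sample(alphas, temperature, rng):
    """Softmax of (log(alpha_k) + G_k) / temperature; returns a point on the simplex."""
    logits = [(math.log(a) + sample_gumbel(rng)) / temperature for a in alphas]
    top = max(logits)
    exps = [math.exp(l - top) for l in logits]  # shift by the max for stability
    total = sum(exps)
    return [e / total for e in exps]

rng = random.Random(0)
alphas = [1.0, 2.0, 5.0]  # illustrative parameters
for temperature in (0.01, 0.5, 100.0):
    sample = concrete_sample(alphas, temperature, rng)
    print(temperature, [round(v, 3) for v in sample])
```

At low temperature the samples tend to sit near a vertex of the simplex (close to one-hot), while at high temperature they drift toward the uniform vector, so the temperature trades off closeness to the discrete variable against smoothness of the gradient.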
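We can use Concrete random variables to compute gradients through discrete stochastic variables. By substituting Concrete random variables for the discrete ones, the variational lower bound can be relaxed as follows, where $q_{\alpha,\lambda_1}$ and $p_{a,\lambda_2}$ are the Concrete relaxations of the variational posterior and the prior, with temperatures $\lambda_1$ and $\lambda_2$.

$$\mathcal{L}_1(\theta, a, \alpha) = \mathbb{E}_{Z \sim q_{\alpha,\lambda_1}(z \mid x)} \left[ \log \frac{p_\theta(x \mid Z)\, p_{a,\lambda_2}(Z)}{q_{\alpha,\lambda_1}(Z \mid x)} \right]$$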
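As a rough illustration of how a relaxed variable lets gradients flow, the sketch below uses the binary case, where the relaxed sample is a sigmoid of logistic noise, and computes a pathwise gradient by hand with the chain rule. The loss `f`, the parameter values, and all function names are assumptions for this example only. It is not the variational objective above, and the resulting gradient is that of the relaxed objective, which is in general a biased estimate of the gradient of the discrete one.

```python
import math
import random

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

def f(x):
    # Hypothetical downstream loss, chosen only for illustration.
    return (x - 0.45) ** 2

def f_prime(x):
    return 2.0 * (x - 0.45)

def relaxed_loss_and_grad(log_alpha, temperature, noise):
    """Monte Carlo average of f(X) and its pathwise gradient w.r.t. log_alpha.

    X = sigmoid((log_alpha + L) / temperature), with L = log(u) - log(1 - u)
    logistic noise built from the uniform samples in `noise`.
    """
    loss, grad = 0.0, 0.0
    for u in noise:
        logistic = math.log(u) - math.log(1.0 - u)
        x = sigmoid((log_alpha + logistic) / temperature)
        loss += f(x)
        # Chain rule: dX/dlog_alpha = X * (1 - X) / temperature
        grad += f_prime(x) * x * (1.0 - x) / temperature
    n = len(noise)
    return loss / n, grad / n

rng = random.Random(0)
# Keep the uniform samples strictly inside (0, 1) so both logs are defined.
noise = [min(max(rng.random(), 1e-12), 1.0 - 1e-12) for _ in range(5000)]
loss, grad = relaxed_loss_and_grad(log_alpha=0.3, temperature=0.5, noise=noise)
print("relaxed loss:", loss, "pathwise gradient:", grad)
```

Because the noise is sampled once and held fixed, the relaxed loss is a deterministic, differentiable function of `log_alpha`, which is exactly what backpropagation needs. In practice an automatic differentiation library would replace the hand-written derivative.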
Concrete relaxations outperformed VIMCO on structured output prediction and on density estimation with non-linear models.
Backpropagation through discrete stochastic variables could be useful in other areas too. A wider variety of experiments would have been nice.