Deep Learning Travels
Don't panic

Neural Arithmetic Logic Units
WHY? Neural networks are poor at manipulating numerical information outside the range of the training set. WHAT? This paper suggests two models that learn to manipulate and extrapolate numbers. The first model is the neural accumulator (NAC), which accumulates quantities in rows additively. This model is a relaxed version of linear...
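
A minimal PyTorch sketch of the NAC cell: following the paper, the effective weight matrix W = tanh(Ŵ) ⊙ σ(M̂) is biased toward {-1, 0, 1}, so the layer learns additions and subtractions that extrapolate beyond the training range. Initialization details here are illustrative assumptions.

```python
import torch
import torch.nn as nn

class NAC(nn.Module):
    """Neural accumulator: a linear layer whose effective weights are
    pushed toward {-1, 0, 1}, so outputs are sums/differences of inputs."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.W_hat = nn.Parameter(torch.randn(out_dim, in_dim) * 0.1)
        self.M_hat = nn.Parameter(torch.randn(out_dim, in_dim) * 0.1)

    def forward(self, x):
        # W = tanh(W_hat) * sigmoid(M_hat) saturates near -1, 0, or 1,
        # which makes the layer accumulate input quantities additively.
        W = torch.tanh(self.W_hat) * torch.sigmoid(self.M_hat)
        return x @ W.t()
```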

VAE with a VampPrior
WHY? Choosing an appropriate prior is important for a VAE. This paper suggests a two-layered VAE with a flexible VampPrior. WHAT? The original variational lower bound of the VAE can be decomposed as follows. The first component is the negative reconstruction error, the second component is the expectation of the entropy of the variational posterior,...
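
A minimal PyTorch sketch of the VampPrior's log-density: a uniform mixture of the variational posterior evaluated at K learned pseudo-inputs, p(z) = (1/K) Σₖ q(z | uₖ). The encoder interface returning (mu, log_var) is an assumption for illustration.

```python
import math
import torch
import torch.nn as nn

class VampPrior(nn.Module):
    """Prior built as a mixture of variational posteriors evaluated
    at K learned pseudo-inputs: p(z) = (1/K) sum_k q(z | u_k)."""
    def __init__(self, encoder, n_pseudo, input_dim):
        super().__init__()
        self.encoder = encoder  # assumed to map x -> (mu, log_var)
        self.pseudo_inputs = nn.Parameter(torch.randn(n_pseudo, input_dim))

    def log_prob(self, z):
        mu, log_var = self.encoder(self.pseudo_inputs)   # (K, z_dim)
        z = z.unsqueeze(1)                               # (B, 1, z_dim)
        # Log-density of each z under every diagonal-Gaussian component.
        log_comp = -0.5 * ((z - mu) ** 2 / log_var.exp()
                           + log_var + math.log(2 * math.pi)).sum(-1)
        # Mixture with uniform weights 1/K.
        return torch.logsumexp(log_comp, dim=1) - math.log(len(self.pseudo_inputs))
```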

Large Scale GAN Training for High Fidelity Natural Image Synthesis
WHY? Generating a high-resolution image with a GAN is difficult despite recent advances. This paper suggests BigGAN, which adds a few tricks to a previous model to generate large-scale images without progressively growing the network. WHAT? BigGAN is built from a series of tricks over a baseline model. Self-Attention GAN (SAGAN), which...
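
One of the tricks the summary names is SAGAN's self-attention block, which lets the generator relate distant spatial positions. A simplified PyTorch sketch is below: the query/key channel reduction by 8 follows SAGAN, but the value path is simplified relative to the paper.

```python
import torch
import torch.nn as nn

class SelfAttention2d(nn.Module):
    """SAGAN-style self-attention over the spatial positions of a
    feature map, added residually with a learned gate gamma."""
    def __init__(self, channels):
        super().__init__()
        self.query = nn.Conv2d(channels, channels // 8, 1)
        self.key = nn.Conv2d(channels, channels // 8, 1)
        self.value = nn.Conv2d(channels, channels, 1)
        self.gamma = nn.Parameter(torch.zeros(1))  # starts as identity map

    def forward(self, x):
        b, c, h, w = x.shape
        q = self.query(x).flatten(2).transpose(1, 2)  # (b, hw, c//8)
        k = self.key(x).flatten(2)                    # (b, c//8, hw)
        attn = torch.softmax(q @ k, dim=-1)           # (b, hw, hw)
        v = self.value(x).flatten(2)                  # (b, c, hw)
        out = (v @ attn.transpose(1, 2)).view(b, c, h, w)
        return x + self.gamma * out
```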

Bilinear Attention Networks
WHY? Representing the bilinear relationship between two inputs is expensive. MLB efficiently reduced the number of parameters by substituting the bilinear operation with a Hadamard product. This paper extends this idea to capture bilinear attention between two multi-channel inputs. WHAT? Using low-rank bilinear pooling, attention on visual inputs given a question...
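
A minimal PyTorch sketch of the low-rank (Hadamard-product) bilinear pooling the summary refers to; the tanh activations follow MLB, while the module and parameter names are illustrative assumptions.

```python
import torch
import torch.nn as nn

class LowRankBilinearPooling(nn.Module):
    """MLB-style pooling: replaces a full bilinear form x^T W y with a
    Hadamard product of low-rank projections, cutting parameters from
    O(d_x * d_y * d_out) to O((d_x + d_y + d_out) * rank)."""
    def __init__(self, x_dim, y_dim, rank, out_dim):
        super().__init__()
        self.U = nn.Linear(x_dim, rank, bias=False)
        self.V = nn.Linear(y_dim, rank, bias=False)
        self.P = nn.Linear(rank, out_dim, bias=False)

    def forward(self, x, y):
        # z = P(tanh(Ux) * tanh(Vy)) approximates the bilinear interaction.
        return self.P(torch.tanh(self.U(x)) * torch.tanh(self.V(y)))
```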

Hybrid computing using a neural network with dynamic external memory
WHY? Using external memory, as a modern computer does, gives a neural network access to extensible memory. This paper suggests the Differentiable Neural Computer (DNC), an advanced version of the Neural Turing Machine. WHAT? Reading and writing in the DNC are implemented with a differentiable attention mechanism. The controller of the DNC is a variant of...
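
A minimal PyTorch sketch of the differentiable content-based addressing that underlies DNC reads and writes: cosine similarity between an emitted key and each memory row, sharpened and normalized into soft weights. A single key and read head is shown for illustration; the function name is hypothetical.

```python
import torch
import torch.nn.functional as F

def content_addressing(memory, key, beta):
    """Differentiable content-based read from an external memory.
    memory: (N, W) matrix of N slots, key: (W,), beta: scalar strength."""
    # Cosine similarity between the key and every memory row.
    sim = F.cosine_similarity(memory, key.unsqueeze(0), dim=1)  # (N,)
    # Softmax over slots, sharpened by beta, gives soft read weights,
    # so the whole read is differentiable end to end.
    w = torch.softmax(beta * sim, dim=0)
    return w @ memory  # read vector: weighted sum of memory rows, (W,)
```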