• Neural Arithmetic Logic Units

    WHY? Neural networks are poor at manipulating numerical information outside the range of the training set. WHAT? This paper suggests two models that learn to manipulate and extrapolate numbers. The first model is the neural accumulator (NAC), which accumulates quantities additively. This model is a relaxed version of a linear...
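
    A minimal sketch of the NAC cell, assuming PyTorch: the effective weights are pushed towards {-1, 0, 1} by a tanh-sigmoid parameterization, so the layer learns signed accumulation of its inputs. Class and parameter names are illustrative.

        import torch
        import torch.nn as nn

        class NAC(nn.Module):
            """Neural accumulator: a linear layer whose weights are biased towards {-1, 0, 1}."""
            def __init__(self, in_dim, out_dim):
                super().__init__()
                self.W_hat = nn.Parameter(torch.randn(out_dim, in_dim) * 0.1)
                self.M_hat = nn.Parameter(torch.randn(out_dim, in_dim) * 0.1)

            def forward(self, x):
                # tanh * sigmoid drives each effective weight towards -1, 0, or 1,
                # so the output is (approximately) a signed sum of selected inputs.
                W = torch.tanh(self.W_hat) * torch.sigmoid(self.M_hat)
                return x @ W.t()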


  • VAE with a VampPrior

    WHY? Choosing an appropriate prior is important for a VAE. This paper suggests a two-layered VAE with a flexible VampPrior. WHAT? The original variational lower bound of the VAE can be decomposed as follows. The first component is the negative reconstruction error, the second component is the expectation of the entropy of the variational posterior,...
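
    The VampPrior itself is a uniform mixture of the variational posteriors evaluated at K learned pseudo-inputs. A minimal sketch of its log-density, assuming PyTorch and an encoder callable that returns Gaussian (mu, logvar) parameters; the function and argument names are illustrative.

        import math
        import torch

        def vamp_prior_log_prob(z, pseudo_inputs, encoder):
            """log p(z) under a VampPrior: a uniform mixture of the variational
            posteriors q(z | u_k) evaluated at K learned pseudo-inputs u_k."""
            mu, logvar = encoder(pseudo_inputs)          # assumed to return (K, D) tensors
            z = z.unsqueeze(1)                           # (N, 1, D) for broadcasting
            log_q = -0.5 * (logvar + (z - mu) ** 2 / logvar.exp() + math.log(2 * math.pi))
            log_q = log_q.sum(dim=-1)                    # (N, K) = log q(z | u_k)
            K = pseudo_inputs.shape[0]
            return torch.logsumexp(log_q, dim=1) - math.log(K)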


  • Large Scale GAN Training for High Fidelity Natural Image Synthesis

    WHY? Generating high-resolution images with a GAN is difficult despite recent advances. This paper suggests BigGAN, which adds a few tricks on top of previous models to generate large-scale images without progressively growing the network. WHAT? BigGAN is built from a series of tricks over a baseline model. Self-Attention GAN (SA-GAN), which...
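
    A hedged sketch of the kind of self-attention block SA-GAN introduces over feature maps, assuming PyTorch; the channel reduction and layer names are illustrative, not the exact BigGAN configuration.

        import torch
        import torch.nn as nn

        class SelfAttention2d(nn.Module):
            """SA-GAN-style self-attention: attend over all spatial positions of a
            feature map and add the result back with a learned scale."""
            def __init__(self, channels):
                super().__init__()
                self.f = nn.Conv2d(channels, channels // 8, 1)  # queries
                self.g = nn.Conv2d(channels, channels // 8, 1)  # keys
                self.h = nn.Conv2d(channels, channels, 1)       # values
                self.gamma = nn.Parameter(torch.zeros(1))       # starts as identity mapping

            def forward(self, x):
                B, C, H, W = x.shape
                q = self.f(x).flatten(2)                             # (B, C/8, HW)
                k = self.g(x).flatten(2)                             # (B, C/8, HW)
                v = self.h(x).flatten(2)                             # (B, C, HW)
                attn = torch.softmax(q.transpose(1, 2) @ k, dim=-1)  # (B, HW, HW)
                out = (v @ attn.transpose(1, 2)).view(B, C, H, W)
                return x + self.gamma * out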


  • Bilinear Attention Networks

    WHY? Representing the bilinear relationship between two inputs is expensive. MLB efficiently reduced the number of parameters by substituting the bilinear operation with a Hadamard-product operation. This paper extends this idea to capture bilinear attention between two multi-channel inputs. WHAT? Using low-rank bilinear pooling, attention on the visual inputs given a question...
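
    A minimal sketch of MLB-style low-rank bilinear pooling, assuming PyTorch: the full bilinear form is approximated by a Hadamard product of two low-rank projections. The rank and projection names are illustrative.

        import torch
        import torch.nn as nn

        class LowRankBilinearPooling(nn.Module):
            """Low-rank bilinear pooling: x^T W y is replaced by
            P(tanh(Ux) * tanh(Vy)), cutting parameters from O(d_x * d_y) per
            output to O((d_x + d_y) * rank)."""
            def __init__(self, x_dim, y_dim, rank, out_dim):
                super().__init__()
                self.U = nn.Linear(x_dim, rank, bias=False)
                self.V = nn.Linear(y_dim, rank, bias=False)
                self.P = nn.Linear(rank, out_dim, bias=False)

            def forward(self, x, y):
                # Element-wise (Hadamard) product in the rank-d space stands in
                # for the full bilinear interaction between x and y.
                return self.P(torch.tanh(self.U(x)) * torch.tanh(self.V(y)))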


  • Hybrid computing using a neural network with dynamic external memory

    WHY? Using external memory, as a modern computer does, gives a neural network access to extensible memory. This paper suggests the Differentiable Neural Computer (DNC), an advanced version of the Neural Turing Machine. WHAT? Reading and writing in the DNC are implemented with a differentiable attention mechanism. The controller of the DNC is a variant of...
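
    A minimal sketch of a single content-based read, the core of the differentiable attention used for memory access, assuming PyTorch; it omits the allocation and temporal-linkage mechanisms of the full DNC, and the function name is illustrative.

        import torch

        def content_read(memory, key, beta):
            """Content-based read in the spirit of the NTM / DNC: attention weights
            come from a sharpened cosine similarity between a read key and each
            memory row, and the read vector is the weighted sum of the rows."""
            # memory: (N, W) matrix of N slots, key: (W,), beta: scalar sharpness
            sim = torch.cosine_similarity(memory, key.unsqueeze(0), dim=1)  # (N,)
            weights = torch.softmax(beta * sim, dim=0)                      # (N,)
            return weights @ memory                                         # (W,) read vector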