• Hybrid computing using a neural network with dynamic external memory

    WHY? Equipping a neural network with external memory, as in a modern computer, gives it access to extensible storage. This paper proposes the Differentiable Neural Computer (DNC), an advanced version of the Neural Turing Machine. WHAT? Reading and writing in the DNC are implemented with a differentiable attention mechanism. The controller of the DNC is a variant of...
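    The differentiable read described above can be sketched as content-based addressing: a softmax over similarities between a key and each memory row. This is only the content-lookup piece of the DNC (it omits temporal links and usage-based allocation), and the names below are illustrative:

```python
import numpy as np

def content_read(memory, key, beta):
    # Content-based attention: cosine similarity between the key and
    # each memory row, sharpened by strength beta, normalized by softmax.
    norms = np.linalg.norm(memory, axis=1) * np.linalg.norm(key)
    sim = memory @ key / np.maximum(norms, 1e-8)
    w = np.exp(beta * sim)
    w /= w.sum()          # read weighting over memory rows (sums to 1)
    return w @ memory     # read vector: weighted sum of memory rows

memory = np.array([[1.0, 0.0],
                   [0.0, 1.0],
                   [1.0, 1.0]])
r = content_read(memory, key=np.array([1.0, 0.0]), beta=10.0)
# r is dominated by the first row, the best match for the key
```

    Because every step is a smooth function of the key and strength, gradients flow through the read, which is what makes the memory trainable end to end.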


  • SSD: Single Shot MultiBox Detector

    WHY? The object-box proposal stage of object detection pipelines is complicated and slow. This paper proposes the Single Shot Detector (SSD), which detects objects with a single neural network. WHAT? SSD produces a fixed-size collection of bounding boxes and scores the presence of object classes in those boxes. The front of SSD is a standard...
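    The fixed-size collection of boxes can be sketched as a grid of default (anchor) boxes, one set per feature-map cell; the scale and aspect-ratio values here are illustrative, not the paper's:

```python
import numpy as np

def default_boxes(fmap_size, scales, ratios):
    # One (cx, cy, w, h) default box per cell, scale, and aspect ratio,
    # all in normalized [0, 1] image coordinates.
    boxes = []
    for i in range(fmap_size):
        for j in range(fmap_size):
            cx, cy = (j + 0.5) / fmap_size, (i + 0.5) / fmap_size
            for s in scales:
                for r in ratios:
                    boxes.append((cx, cy, s * np.sqrt(r), s / np.sqrt(r)))
    return np.array(boxes)

boxes = default_boxes(fmap_size=4, scales=[0.2], ratios=[1.0, 2.0])
# 4*4 cells x 1 scale x 2 ratios = 32 boxes, fixed before any image is seen
```

    The network then only has to predict per-box class scores and small offsets relative to these defaults, which is what removes the separate proposal stage.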


  • Progressive Growing of GANs for Improved Quality, Stability, and Variation

    WHY? Training a GAN on high-resolution images is known to be difficult. WHAT? This paper proposes a new training method in which the GAN grows progressively from coarse to fine scale. A pair of generator and discriminator is first trained on low-resolution real and fake images. As the input image size grows,...
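    When a new, higher-resolution layer is added, its output is faded in smoothly rather than switched on at once. A minimal sketch of that blending (the arrays stand in for the coarse stage's upsampled output and the new layer's output):

```python
import numpy as np

def fade_in(old_output, new_output, alpha):
    # Linear blend between the previous (coarser) stage's upsampled
    # output and the new layer's output; alpha ramps from 0 to 1 over
    # training so the new layer is introduced gradually.
    return (1.0 - alpha) * old_output + alpha * new_output

old = np.zeros((8, 8))    # stands in for the upsampled coarse image
new = np.ones((8, 8))     # stands in for the new fine-scale output
mid = fade_in(old, new, alpha=0.25)   # early in the fade-in phase
```

    At alpha = 0 the network behaves exactly like the smaller one it grew from, which keeps training stable while the new layer's weights are still random.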


  • BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

    WHY? Earlier Transformer language models were unidirectional. WHAT? BERT is a multi-layer bidirectional Transformer encoder. An input sequence can represent either a single text sentence or a pair of text sentences. The first token of every sequence is a classification token, and sentences in a sequence are separated by...
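    The input packing described above can be sketched as follows; this uses word-level tokens instead of BERT's WordPiece vocabulary, but the [CLS]/[SEP] layout and segment ids follow the paper:

```python
def pack_pair(tokens_a, tokens_b):
    # BERT-style packing of a sentence pair: [CLS] at the front,
    # [SEP] after each sentence, and segment ids 0/1 telling the
    # model which sentence each token belongs to.
    tokens = ["[CLS]"] + tokens_a + ["[SEP]"] + tokens_b + ["[SEP]"]
    segments = [0] * (len(tokens_a) + 2) + [1] * (len(tokens_b) + 1)
    return tokens, segments

tokens, segments = pack_pair(["the", "dog"], ["it", "ran"])
# tokens:   [CLS] the dog [SEP] it ran [SEP]
# segments:   0    0   0    0   1   1    1
```

    The final hidden state at the [CLS] position is what downstream classification tasks read, which is why it leads every sequence.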


  • On the Dimensionality of Word Embedding

    WHY? The dimensionality of word embeddings is usually chosen heuristically. WHAT? This paper proposes a new metric for evaluating the quality of word embeddings and uses it to find the optimal dimensionality. Word embedding algorithms are shown to implicitly factorize a PMI matrix. However, the L2 loss between the embedding and the factorized...
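    The implicit factorization can be illustrated with a truncated SVD: keeping the top-d singular directions of a PMI matrix yields d-dimensional embeddings. The matrix below is a toy stand-in; a real PMI matrix would come from corpus co-occurrence counts:

```python
import numpy as np

# Toy symmetric PMI-style matrix (illustrative values only)
pmi = np.array([[2.0, 0.5, 0.0],
                [0.5, 1.5, 0.3],
                [0.0, 0.3, 1.0]])

u, s, vt = np.linalg.svd(pmi)
d = 2                              # chosen embedding dimensionality
emb = u[:, :d] * np.sqrt(s[:d])    # one d-dimensional vector per word
recon = emb @ emb.T                # rank-d approximation of the PMI matrix
```

    Varying d trades off how much of the matrix the embeddings can reconstruct against noise and overfitting, which is the trade-off the paper's metric is designed to quantify.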