• Neural Variational Inference and Learning in Belief Networks

    WHY? Directed latent variable models are known to be difficult to train at large scale because the posterior distribution is intractable. WHAT? This paper suggests a way to build the inference model with a feed-forward network. Since the exact posterior is intractable, we use this network to approximate it. Since h is sampled from the posterior, it is impossible...
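
    As a rough illustration of the idea (a sketch under my own assumptions and naming, not the paper's exact model or code): a feed-forward inference network produces q(h|x), h is sampled from it, and because the discrete sample blocks backpropagation, the inference network is trained with a score-function (REINFORCE-style) estimator of the ELBO gradient.

    ```python
    import torch
    import torch.nn as nn
    from torch.distributions import Bernoulli

    class InferenceNet(nn.Module):
        """q(h|x): feed-forward network producing Bernoulli logits for latent h."""
        def __init__(self, x_dim=784, h_dim=200):
            super().__init__()
            self.net = nn.Linear(x_dim, h_dim)

        def forward(self, x):
            q = Bernoulli(logits=self.net(x))
            h = q.sample()                          # discrete sample: no gradient path
            return h, q.log_prob(h).sum(-1)

    class SigmoidBeliefNet(nn.Module):
        """p(h) p(x|h): factorized Bernoulli prior and decoder."""
        def __init__(self, x_dim=784, h_dim=200):
            super().__init__()
            self.prior_logits = nn.Parameter(torch.zeros(h_dim))
            self.decoder = nn.Linear(h_dim, x_dim)

        def log_joint(self, x, h):
            log_ph = Bernoulli(logits=self.prior_logits).log_prob(h).sum(-1)
            log_px = Bernoulli(logits=self.decoder(h)).log_prob(x).sum(-1)
            return log_ph + log_px

    def nvil_step(x, inf_net, gen_net, baseline=0.0):
        """One Monte-Carlo ELBO estimate and a REINFORCE-style surrogate loss."""
        h, log_q = inf_net(x)
        log_joint = gen_net.log_joint(x, h)
        elbo = log_joint - log_q
        signal = (elbo - baseline).detach()         # centred learning signal
        surrogate = log_joint + signal * log_q      # decoder grads via log_joint, encoder grads via score function
        return -surrogate.mean(), elbo.mean().item()
    ```

    The paper additionally uses learned baselines and variance normalization to tame the score-function estimator; the constant `baseline` above is just a placeholder for that.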


  • Neural Autoregressive Distribution Estimation

    WHY? Estimating the distribution of data can help solve various predictive tasks. Roughly three approaches are available: directed graphical models, undirected graphical models, and density estimation with autoregressive models and feed-forward neural networks (NADE). WHAT? An autoregressive generative model factorizes the data distribution into D conditionals. NADE uses a feed-forward neural network to...
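
    A minimal sketch of the NADE idea for binary data (my own simplified version, not the authors' code): each conditional p(x_d | x_<d) is computed by a feed-forward layer whose hidden pre-activation is accumulated incrementally across dimensions, which keeps evaluation at O(DH) instead of O(D^2 H).

    ```python
    import torch
    import torch.nn as nn

    class TinyNADE(nn.Module):
        def __init__(self, D=784, H=500):
            super().__init__()
            self.W = nn.Parameter(0.01 * torch.randn(H, D))  # shared encoder weights
            self.c = nn.Parameter(torch.zeros(H))            # hidden bias
            self.V = nn.Parameter(0.01 * torch.randn(D, H))  # per-dimension output weights
            self.b = nn.Parameter(torch.zeros(D))            # per-dimension output biases

        def log_prob(self, x):                               # x: (batch, D) in {0, 1}
            a = self.c.expand(x.size(0), -1)                 # running hidden pre-activation
            log_p = 0.0
            for d in range(x.size(1)):
                h = torch.sigmoid(a)                         # hidden units given x_<d
                logit = h @ self.V[d] + self.b[d]            # logit of p(x_d = 1 | x_<d)
                log_p = log_p - nn.functional.binary_cross_entropy_with_logits(
                    logit, x[:, d], reduction="none")
                a = a + torch.outer(x[:, d], self.W[:, d])   # add x_d's contribution for the next step
            return log_p                                     # (batch,) log-likelihoods
    ```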


  • Glow: Generative Flow with Invertible 1x1 Convolutions

    WHY? The motivation is almost the same as that of NICE and RealNVP. WHAT? The architecture of the generative flow (Glow) is almost the same as the multi-scale architecture of RealNVP. A step of flow in Glow uses actnorm instead of batch normalization and an invertible 1x1 convolution instead of a fixed reverse ordering of the channels. Actnorm performs...
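
    A rough sketch of these two components (assumed shapes and naming, not the official Glow implementation): actnorm as a per-channel affine transform with data-dependent initialization, and an invertible 1x1 convolution whose log-determinant enters the flow objective.

    ```python
    import torch
    import torch.nn as nn

    class ActNorm(nn.Module):
        def __init__(self, channels):
            super().__init__()
            self.log_scale = nn.Parameter(torch.zeros(1, channels, 1, 1))
            self.bias = nn.Parameter(torch.zeros(1, channels, 1, 1))
            self.initialized = False

        def forward(self, x):                                # x: (B, C, H, W)
            if not self.initialized:                         # data-dependent init: zero mean, unit variance per channel
                with torch.no_grad():
                    mean = x.mean(dim=(0, 2, 3), keepdim=True)
                    std = x.std(dim=(0, 2, 3), keepdim=True)
                    self.bias.copy_(-mean / (std + 1e-6))
                    self.log_scale.copy_(-torch.log(std + 1e-6))
                self.initialized = True
            y = x * self.log_scale.exp() + self.bias
            logdet = x.size(2) * x.size(3) * self.log_scale.sum()
            return y, logdet

    class Invertible1x1Conv(nn.Module):
        def __init__(self, channels):
            super().__init__()
            w, _ = torch.linalg.qr(torch.randn(channels, channels))  # random rotation: invertible at init
            self.weight = nn.Parameter(w)

        def forward(self, x):                                # x: (B, C, H, W)
            y = nn.functional.conv2d(x, self.weight.unsqueeze(-1).unsqueeze(-1))
            logdet = x.size(2) * x.size(3) * torch.slogdet(self.weight)[1]
            return y, logdet
    ```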


  • [Pytorch] MUNIT

    PyTorch implementation of Multimodal Unsupervised Image-to-Image Translation. https://github.com/Lyusungwon/munit_pytorch Reference: https://github.com/NVlabs/MUNIT Note: My impression of this paper differed a lot from when I first read it. The 8+ models took all of my memory, so I had to train with batch size < 4. 8 latent variables vs 256...


  • A Character-Level Decoder without Explicit Segmentation for Neural Machine Translation

    WHY? Previous neural machine translators are based on word-level translation, which suffers from the critical problem of out-of-vocabulary errors. WHAT? This paper suggests a Bi-Scale recurrent neural network with attention to model a character-level decoder. The source input is encoded into BPE subwords. The slower layer carries word-level information and the faster layer carries...
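
    A loose sketch of the two-timescale decoder idea (heavily simplified and hypothetical: here both cells update at every character step, whereas the paper's bi-scale gating lets the slower layer change only gradually, roughly at word boundaries):

    ```python
    import torch
    import torch.nn as nn

    class BiScaleDecoderSketch(nn.Module):
        def __init__(self, vocab_size=128, emb=64, hidden=256):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, emb)
            self.fast = nn.GRUCell(emb + hidden, hidden)   # character-level dynamics
            self.slow = nn.GRUCell(hidden, hidden)         # word-level dynamics
            self.out = nn.Linear(hidden * 2, vocab_size)

        def forward(self, chars, context):                 # chars: (B, T), context: (B, hidden)
            B, T = chars.shape
            h_fast = torch.zeros(B, self.fast.hidden_size, device=chars.device)
            h_slow = context                               # slow layer seeded by encoder/attention context
            logits = []
            for t in range(T):
                x = torch.cat([self.embed(chars[:, t]), h_slow], dim=-1)
                h_fast = self.fast(x, h_fast)              # updated at every character
                h_slow = self.slow(h_fast, h_slow)         # fed by the fast layer, carries coarser context
                logits.append(self.out(torch.cat([h_fast, h_slow], dim=-1)))
            return torch.stack(logits, dim=1)              # (B, T, vocab)
    ```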