• Deep Generative Image Models using a Laplacian Pyramid of Adversarial Networks

    WHY? GANs had trouble modeling an entire image at once. Note that the Laplacian pyramid framework is used for restoring compressed images: when an image is compressed to a smaller size, it loses the high-resolution information, so simply enlarging the image is not enough to restore the original data. Laplacian pyramid framework...
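    A minimal sketch of the decomposition the framework builds on, in NumPy, with naive 2x average-pool downsampling and nearest-neighbor upsampling standing in for the usual Gaussian filtering; the function names and sizes are illustrative, not from the paper's code.

    ```python
    import numpy as np

    def downsample(img):
        # 2x2 average pooling (stand-in for Gaussian blur + subsampling)
        return img.reshape(img.shape[0] // 2, 2, img.shape[1] // 2, 2).mean(axis=(1, 3))

    def upsample(img):
        # nearest-neighbor 2x upsampling
        return img.repeat(2, axis=0).repeat(2, axis=1)

    def build_laplacian_pyramid(img, levels=3):
        pyramid, current = [], img
        for _ in range(levels):
            smaller = downsample(current)
            # the residual is exactly the detail lost by downsampling;
            # LAPGAN trains a conditional GAN per scale to generate it
            pyramid.append(current - upsample(smaller))
            current = smaller
        pyramid.append(current)  # coarsest image
        return pyramid

    def reconstruct(pyramid):
        img = pyramid[-1]
        for residual in reversed(pyramid[:-1]):
            img = upsample(img) + residual
        return img

    img = np.random.rand(32, 32)
    assert np.allclose(reconstruct(build_laplacian_pyramid(img)), img)  # exact
    ```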


  • Breaking the Softmax Bottleneck: A High-Rank RNN Language Model

    WHY? This paper first proves that the expressiveness of a language model is restricted by the softmax and then suggests a way to overcome this limit. WHAT? The last part of a language model usually consists of a softmax layer applied to the product of a context vector (h) and a word embedding (w). This paper...
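    The paper's fix is a Mixture of Softmaxes (MoS). A minimal sketch of the contrast in PyTorch, with toy shapes and hypothetical proj/prior layers (the paper's exact parameterization differs):

    ```python
    import torch
    import torch.nn.functional as F

    B, d, V, K = 8, 64, 1000, 4    # batch, hidden size, vocab size, mixtures
    h = torch.randn(B, d)          # context vectors from the RNN
    W = torch.randn(V, d)          # output word embeddings

    # Standard softmax: the log-probability matrix has rank bounded by d,
    # the "softmax bottleneck"
    p_single = F.softmax(h @ W.T, dim=-1)                 # (B, V)

    # Mixture of Softmaxes: K contexts per example, mixed in probability space
    proj = torch.nn.Linear(d, K * d)                      # hypothetical layer
    prior = torch.nn.Linear(d, K)                         # mixture weights
    h_k = torch.tanh(proj(h)).view(B, K, d)               # (B, K, d)
    pi = F.softmax(prior(h), dim=-1)                      # (B, K)
    p_k = F.softmax(h_k @ W.T, dim=-1)                    # (B, K, V)
    p_mos = (pi.unsqueeze(-1) * p_k).sum(dim=1)           # (B, V)
    ```

    Mixing in probability space (rather than averaging logits) is what lifts the rank bound, since the log of a sum of softmaxes is no longer a low-rank bilinear form.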


  • Eigenvalues of the Hessian in Deep Learning: Singularity and Beyond

    WHY? Gradient descent methods depend on the first-order gradient of a loss function w.r.t. the parameters, while the second-order gradient (the Hessian) is often neglected. WHAT? This paper explored the exact Hessian of a neural network (after convergence) and discovered that the eigenvalues of the Hessian are separated into two groups: 0s and...
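    A toy sketch of what computing exact Hessian eigenvalues means, for a tiny network via torch.autograd.functional.hessian; the data and architecture are made-up assumptions, not the paper's setup:

    ```python
    import torch
    from torch.autograd.functional import hessian

    x = torch.randn(16, 3)        # toy inputs
    y = torch.randn(16, 1)        # toy targets

    def loss(w):
        # 3 -> 4 -> 1 tanh network, parameters packed into one flat vector
        w1, w2 = w[:12].reshape(3, 4), w[12:].reshape(4, 1)
        return ((torch.tanh(x @ w1) @ w2 - y) ** 2).mean()

    w = torch.randn(16)           # 16 parameters total
    H = hessian(loss, w)          # exact (16, 16) Hessian of the loss
    eigvals = torch.linalg.eigvalsh(H)
    print(eigvals)                # paper: a bulk near 0 plus a few outliers
    ```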


  • Variational Inference for Monte Carlo Objectives

    WHY? Recent variational training requires sampling from the variational posterior to estimate gradients. The NVIL estimator suggests a method to estimate the gradient of the loss function w.r.t. the parameters. Since the score-function estimator is known to have high variance, a baseline is used as a variance-reduction technique. However, this technique is insufficient...
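    For reference, a minimal sketch of the score-function (REINFORCE) estimator with a running-average baseline, i.e. the variance-reduction setup the paper improves on; the Bernoulli toy model and constants are assumptions:

    ```python
    import torch

    logits = torch.zeros(10, requires_grad=True)  # variational parameters

    def reward(z):
        # arbitrary toy objective on the binary latent sample z
        return -((z - 0.7) ** 2).sum()

    baseline, beta = 0.0, 0.9                     # running-average baseline
    opt = torch.optim.SGD([logits], lr=0.1)

    for step in range(200):
        q = torch.distributions.Bernoulli(logits=logits)
        z = q.sample()                            # non-differentiable sample
        r = reward(z)
        # score-function estimator: (r - baseline) * grad log q(z);
        # subtracting the baseline keeps the gradient unbiased
        # while reducing its variance
        loss = -(r - baseline) * q.log_prob(z).sum()
        opt.zero_grad()
        loss.backward()
        opt.step()
        baseline = beta * baseline + (1 - beta) * r.item()
    ```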


  • [Pytorch] MADE

    Pytorch implementation of MADE: Masked Autoencoder for Distribution Estimation. https://github.com/Lyusungwon/generative_models_pytorch

    Reference: https://github.com/karpathy/pytorch-made

    Note: Autoregressive sampling was tricky.

    Results (config): model 180817182411_made_1000_200_0.001_28_28_1000_2_1_False, epochs 1000, batch-size 200, lr 1e-3, hidden-size 1000, layer-size 2, mask-num 1, start-sample 394, random-order False.

    [Figures: test loss curve; samples; original vs. reconstruction; inpainting input vs. output]
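    Since the masks are the heart of MADE, here is a minimal sketch of the mask construction for a single hidden layer, following the paper's degree scheme; the sizes are toy assumptions, not the config above:

    ```python
    import torch

    D, H = 4, 8                          # input dim, hidden units (toy sizes)
    m_in = torch.arange(1, D + 1)        # input degrees 1..D
    m_hid = torch.randint(1, D, (H,))    # hidden degrees drawn from [1, D-1]

    # hidden unit j may connect to input i only if m_hid[j] >= m_in[i]
    mask_in = (m_hid.unsqueeze(1) >= m_in.unsqueeze(0)).float()  # (H, D)
    # output unit i may connect to hidden j only if m_in[i] > m_hid[j],
    # so output i depends only on inputs with smaller index (autoregressive)
    mask_out = (m_in.unsqueeze(1) > m_hid.unsqueeze(0)).float()  # (D, H)

    # the masks multiply the weight matrices elementwise in the forward pass
    W1 = torch.randn(H, D) * mask_in
    W2 = torch.randn(D, H) * mask_out
    ```

    Sampling then proceeds one dimension at a time, re-running the masked network after each dimension is filled in, which is part of why it is easy to get wrong.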