Deep Contextualized Word Representations
WHY?
Earlier word representations such as Word2Vec or GloVe did not capture linguistic context; each word receives a single, context-independent vector.
Directed latent variable models are known to be difficult to train at large scale because the posterior distribution is intractable.
Estimating the distribution of the data can help in solving various predictive tasks. Roughly three approaches are available: directed graphical models, undirected graphical models, and density estimation with autoregressive models and feed-forward neural networks (e.g., NADE).
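For reference (not part of the original note), the autoregressive approach mentioned above, as in NADE, factorizes the joint density into a product of conditionals, each modeled by a neural network:
\[ p(x) = \prod_{d=1}^{D} p(x_d \mid x_{<d}) \]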
The motivation is almost the same as that of NICE and RealNVP.
All previous neural machine translation systems are based on word-level translation. Word-level translators have the critical problem of out-of-vocabulary errors.
The motivation is almost the same as that of NICE. This paper suggests a more elaborate transformation to represent complex data.
Modeling data with a known probability distribution has many advantages: we can exactly calculate the log-likelihood of the data and easily sample new data from the distribution. However, finding a tractable transformation between the data and a probability distribution, in either direction, is difficult. For instance, a neural encoder is a common way to transform data, but its log-likelihood is intractable, and a separately trained decoder is required to sample data.
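As a reference point (not from the original note), the change-of-variables formula that flow-based models such as NICE and RealNVP rely on gives this exact log-likelihood whenever the transformation \(f\) is invertible with a tractable Jacobian determinant:
\[ \log p_X(x) = \log p_Z\bigl(f(x)\bigr) + \log \left| \det \frac{\partial f(x)}{\partial x} \right| \]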
The same motivation as Concrete.