A commonsense knowledge graph can be a useful source of explicit knowledge for generating text that makes sense. However, using a KG directly is hard because it holds far more information than any single generation needs. Retrieving the subgraph relevant to the generation is the key.
Though a knowledge graph can capture the essence of a corpus, generating sentences from the graph is a difficult task. This paper tries to generate text (paper abstracts) from a KG in the scientific (AI) domain.
A knowledge graph is a graph representation of knowledge: entities are represented as nodes and relations between entities as edges. A commonsense knowledge graph stores commonsense knowledge in this form. Two common commonsense knowledge graph datasets are ATOMIC and ConceptNet.
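As an illustration of this representation, here is a minimal Python sketch of a commonsense KG stored as (head, relation, tail) triples in the style of ConceptNet. The entities and relations are toy examples, not entries from the actual dataset.

```python
# Minimal sketch of a commonsense KG as (head, relation, tail) triples,
# ConceptNet-style. All triples here are illustrative, not real dataset entries.
from collections import defaultdict

triples = [
    ("dog", "IsA", "animal"),
    ("dog", "CapableOf", "bark"),
    ("bark", "HasSubevent", "make noise"),
]

# Index edges by head node so a node's neighbors can be retrieved quickly.
out_edges = defaultdict(list)
for head, rel, tail in triples:
    out_edges[head].append((rel, tail))

print(out_edges["dog"])  # [('IsA', 'animal'), ('CapableOf', 'bark')]
```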
Information retrieval with a search engine becomes difficult when the query is incomplete or too complex. This paper proposes a query reformulation system that rewrites the query to maximize the probability that relevant documents are returned.
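A hedged sketch of what such a training signal could look like: the reward for a rewritten query is the recall of the relevant documents it retrieves. `reformulate` and `search` are hypothetical stand-ins, not an API from the paper.

```python
# Sketch of a reward for a query-reformulation policy: recall of the
# relevant documents retrieved for the rewritten query. `reformulate` and
# `search` are hypothetical placeholders for the policy and the search engine.
def recall(retrieved, relevant):
    return len(set(retrieved) & set(relevant)) / max(len(relevant), 1)

def reward(query, relevant_docs, reformulate, search):
    new_query = reformulate(query)   # policy proposes a rewritten query
    retrieved = search(new_query)    # black-box search engine returns doc ids
    return recall(retrieved, relevant_docs)
```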
Skip-Gram with Negative Sampling (SGNS) shows impressive performance compared to traditional word embedding methods. However, it was not clear what objective SGNS converges to.
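For reference, the best-known answer to this question is Levy and Goldberg's (2014) result that SGNS with k negative samples implicitly factorizes the shifted PMI matrix, whose entries are PMI(w, c) - log k. A toy numpy sketch of that view follows; the co-occurrence counts are made up.

```python
# Toy sketch of the shifted-PMI view of SGNS (Levy & Goldberg, 2014):
# at the optimum, w . c = PMI(w, c) - log k, so SGNS-like vectors can be
# approximated by SVD of the (clipped) shifted PMI matrix.
import numpy as np

counts = np.array([[10., 2.], [3., 8.]])   # toy word-context co-occurrence counts
p_wc = counts / counts.sum()
p_w = p_wc.sum(axis=1, keepdims=True)
p_c = p_wc.sum(axis=0, keepdims=True)

k = 5                                      # number of negative samples
pmi = np.log(p_wc / (p_w * p_c))
sppmi = np.maximum(pmi - np.log(k), 0)     # shifted PPMI: clip negatives to 0

# Rank-d SVD factorization yields word embeddings.
u, s, vt = np.linalg.svd(sppmi)
d = 1
word_vecs = u[:, :d] * np.sqrt(s[:d])
print(word_vecs)
```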
Neural word embeddings (skip-gram) seem to outperform traditional count-based distributional models. However, this paper argues that word2vec's apparent superiority comes not from the algorithm itself but from system design choices and hyperparameter optimizations.
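One example of such a transferable design choice is context distribution smoothing, which word2vec applies in its negative-sampling distribution and which can be grafted onto a count-based PPMI model. A toy sketch, assuming alpha = 0.75 and made-up counts:

```python
# Sketch of context distribution smoothing (alpha = 0.75), a word2vec design
# choice transplanted into a count-based PPMI model. Counts are toy values.
import numpy as np

counts = np.array([[10., 2.], [3., 8.]])
alpha = 0.75

p_wc = counts / counts.sum()
p_w = p_wc.sum(axis=1, keepdims=True)

# Smooth the context distribution: p(c) proportional to count(c)**alpha.
c_counts = counts.sum(axis=0)
p_c = (c_counts ** alpha) / (c_counts ** alpha).sum()

ppmi = np.maximum(np.log(p_wc / (p_w * p_c)), 0)
print(ppmi)
```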
Previous neural machine translation systems operate at the word level. Word-level translators have a critical problem: out-of-vocabulary errors.
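A minimal sketch of the problem: with a word-level vocabulary, any unseen word collapses to `<unk>`, while character-level units never fall out of vocabulary. The vocabulary and sentences are toy examples.

```python
# Toy illustration of the out-of-vocabulary problem at the word level,
# and a character-level fallback that avoids it.
vocab = {"the": 0, "cat": 1, "sat": 2, "<unk>": 3}

def word_ids(sentence):
    return [vocab.get(w, vocab["<unk>"]) for w in sentence.split()]

print(word_ids("the cat sat"))           # [0, 1, 2]
print(word_ids("the caterpillar sat"))   # [0, 3, 2]  <- OOV collapses to <unk>

# Character-level units cover any word built from known characters.
char_vocab = {ch: i for i, ch in enumerate(sorted(set("the caterpillar sat")))}
print([char_vocab[ch] for ch in "caterpillar"])
```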
The hierarchical recurrent encoder-decoder (HRED) model aims to capture the hierarchical structure of sequential data, but it tends to fail because the model is encouraged to capture only local structure and the LSTM often suffers from vanishing gradients.
Attention in previous machine comprehension models focuses on a small part of the context, summarizes the context into a fixed-length vector, and applies attention unidirectionally and temporally. These attention methods lose information during summarization, and because successive attention steps depend on each other, the role of attention and the role of the model become entangled.
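If this refers to the Bi-Directional Attention Flow (BiDAF) model, the remedy it points toward is attention flowing in both directions from a shared similarity matrix, with no fixed-length summary of the context and no dependency between attention steps. A simplified numpy sketch; the shapes and the dot-product similarity are assumptions for illustration.

```python
# Simplified sketch of bidirectional attention computed from one shared
# similarity matrix: context-to-query and query-to-context, with the full
# context kept instead of a fixed-length summary. Dot-product similarity
# and random encodings are stand-ins for the model's learned components.
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

T, J, d = 4, 3, 8
H = np.random.randn(T, d)   # context encodings, one vector per context word
U = np.random.randn(J, d)   # query encodings, one vector per query word

S = H @ U.T                 # similarity matrix (simplified: dot product)

# Context-to-query: each context position attends over the query words.
c2q = softmax(S, axis=1) @ U              # (T, d)

# Query-to-context: weight context positions by their best query match,
# then tile the attended vector across the context.
b = softmax(S.max(axis=1), axis=0)        # (T,)
q2c = np.tile(b @ H, (T, 1))              # (T, d)

G = np.concatenate([H, c2q, H * c2q, H * q2c], axis=1)  # merged features
print(G.shape)  # (4, 32)
```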