Linguistic Regularities in Continuous Space Word Representations
WHY?
Vector space word representations capture syntactic and semantic regularities in language: many lexical relations show up as consistent vector offsets, so analogy questions can be answered with simple vector arithmetic.
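A minimal sketch of the vector-offset analogy idea, using a tiny hypothetical embedding dictionary (the vectors and vocabulary below are placeholders for illustration, not the paper's trained model):

```python
import numpy as np

# Hypothetical toy embeddings; real experiments use vectors trained on large corpora.
emb = {
    "king":  np.array([0.8, 0.6, 0.1]),
    "man":   np.array([0.7, 0.1, 0.0]),
    "woman": np.array([0.6, 0.2, 0.9]),
    "queen": np.array([0.7, 0.7, 1.0]),
}

def analogy(a, b, c, emb):
    """Answer 'a is to b as c is to ?' with the vector-offset method:
    return the word whose vector is closest (cosine) to b - a + c."""
    target = emb[b] - emb[a] + emb[c]
    best, best_sim = None, -1.0
    for w, v in emb.items():
        if w in (a, b, c):
            continue
        sim = v @ target / (np.linalg.norm(v) * np.linalg.norm(target))
        if sim > best_sim:
            best, best_sim = w, sim
    return best

print(analogy("man", "king", "woman", emb))  # with good embeddings: "queen"
```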
Synchronization is an important issue in distributed SGD: synchronizing too infrequently among nodes causes unstable training, while synchronizing too frequently causes high communication cost.
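As a rough illustration of this trade-off, here is a generic local-SGD / parameter-averaging sketch (not necessarily the scheme the summarized paper proposes; the gradients are random placeholders). The `sync_every` interval is the knob: a larger interval means less communication but more divergence between worker copies.

```python
import numpy as np

def local_sgd(params, n_workers, sync_every, n_steps, lr=0.1):
    """Toy local SGD: each worker updates its own copy of the parameters,
    and copies are averaged every `sync_every` steps (one communication round)."""
    copies = [params.copy() for _ in range(n_workers)]
    for step in range(1, n_steps + 1):
        for w in range(n_workers):
            grad = np.random.randn(*params.shape)  # stand-in for a real minibatch gradient
            copies[w] -= lr * grad
        if step % sync_every == 0:                 # synchronize: average all copies
            avg = np.mean(copies, axis=0)
            copies = [avg.copy() for _ in range(n_workers)]
    return np.mean(copies, axis=0)

final = local_sgd(np.zeros(4), n_workers=8, sync_every=10, n_steps=100)
```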
Skip-Gram with Negative Sampling (SGNS) showed impressive performance compared to traditional word embedding methods. However, it was not clear what its embeddings actually converge to.
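For reference, the standard per-pair SGNS objective (w and c denote word and context vectors, k the number of negative samples drawn from a noise distribution P_n); the question above is what the optimum of this objective corresponds to:

$$
\ell(w, c) = \log \sigma(\vec{w} \cdot \vec{c}) + k \, \mathbb{E}_{c_N \sim P_n}\!\left[\log \sigma(-\vec{w} \cdot \vec{c}_N)\right]
$$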
Traditional continuous word embeddings are based on linear contexts; in other words, they consider only the words in a fixed window around the target word as its context.
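A minimal sketch of what a linear (window-based) context means in practice; the window size and example sentence below are made up for illustration:

```python
def linear_contexts(tokens, window=2):
    """For each position, the context is simply the surrounding words
    within a fixed window, regardless of syntactic structure."""
    pairs = []
    for i, word in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((word, tokens[j]))
    return pairs

print(linear_contexts("australian scientist discovers star".split()))
```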
Models with a huge number of parameters, or trained on huge amounts of data, do not fit in the GPU memory of a single machine.
Word embeddings trained with neural networks (e.g., skip-gram) seem to outperform traditional count-based distributional models. However, this paper points out that word2vec's current superiority comes not from the algorithm itself, but from system design choices and hyperparameter optimizations.
A bilinear model can capture rich interactions between two vectors. However, its computational complexity is huge due to the high dimensionality of its weight tensor. To make bilinear models more practical, this paper suggests low-rank bilinear pooling using the Hadamard product.
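A minimal sketch of low-rank bilinear pooling with a Hadamard product (the dimensions and the tanh nonlinearity are illustrative assumptions; exact details vary by model): instead of a full d_x x d_y weight matrix per output unit, both inputs are projected to a shared rank-d space, fused element-wise, and projected to the output.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: d_x, d_y input dims, d the low rank, c the output dim.
d_x, d_y, d, c = 2048, 300, 512, 1000
U = rng.normal(size=(d_x, d))   # projects x into the rank-d space
V = rng.normal(size=(d_y, d))   # projects y into the rank-d space
P = rng.normal(size=(d, c))     # maps the fused feature to the output

def low_rank_bilinear(x, y):
    """Fuse the two projected vectors with an element-wise (Hadamard) product,
    then project to the output dimension."""
    return P.T @ (np.tanh(U.T @ x) * np.tanh(V.T @ y))

f = low_rank_bilinear(rng.normal(size=d_x), rng.normal(size=d_y))
print(f.shape)  # (1000,)
```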
The visual question answering task is to answer natural language questions about images. To handle questions that require multi-step reasoning, stacked attention networks (SANs) stack several attention layers that attend to parts of the image based on the query.
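A rough sketch of one attention hop stacked twice, following the general shape of the stacked-attention idea (dimensions, region count, and the random image/question features below are placeholder assumptions, not the paper's exact architecture):

```python
import numpy as np

rng = np.random.default_rng(0)
d, k, m = 512, 256, 196          # feature dim, attention hidden dim, number of image regions

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def attention_layer(V, u, Wv, Wu, w):
    """One attention hop: score each image region against the current query
    vector u, take a weighted sum of regions, and refine the query with it."""
    h = np.tanh(Wv @ V + (Wu @ u)[:, None])   # (k, m) joint scores
    p = softmax(w @ h)                        # attention weights over the m regions
    v_att = V @ p                             # (d,) attended visual summary
    return u + v_att                          # refined query for the next hop

V = rng.normal(size=(d, m))                   # placeholder image-region features
u = rng.normal(size=d)                        # placeholder question embedding
for _ in range(2):                            # two stacked attention hops
    Wv, Wu, w = rng.normal(size=(k, d)), rng.normal(size=(k, d)), rng.normal(size=k)
    u = attention_layer(V, u, Wv, Wu, w)
```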