Answerer in Questioner's Mind: Information Theoretic Approach to Goal-Oriented Visual Dialog

WHY?

Goal-oriented dialogue tasks require two agents (a questioner and an answerer) to communicate in order to solve a task. Previous supervised learning and reinforcement learning approaches struggled to generate appropriate questions because of the complexity of forming a sentence. This paper proposes an information-theoretic approach to this task.


Multimodal Compact Bilinear Pooling for Visual Question Answering and Visual Grounding

WHY?

Earlier methods used element-wise sum, element-wise product, or concatenation to represent the relation between two vectors. A bilinear model (the outer product) of two vectors is a more expressive way to represent that relation, but its dimensionality usually becomes too large. This paper proposes multimodal compact bilinear pooling (MCB) to represent such relations compactly.
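The summary above does not say how the compaction is done. As a rough illustration only, the NumPy sketch below assumes the Count Sketch plus FFT construction commonly associated with compact bilinear pooling: each vector is hashed into `d` buckets with random signs, and the two sketches are combined by circular convolution, which equals a Count Sketch of the full outer product. All function and variable names here (`count_sketch`, `mcb`, `hx`, `sx`, and so on) are hypothetical, and the sizes are toy values.

```python
import numpy as np

def count_sketch(x, h, s, d):
    """Project x (length n) into d buckets: out[h[i]] += s[i] * x[i]."""
    out = np.zeros(d)
    np.add.at(out, h, s * x)  # unbuffered add, so repeated buckets accumulate
    return out

def mcb(x, y, hx, sx, hy, sy, d):
    """Combine two vectors via circular convolution of their count sketches."""
    fx = np.fft.rfft(count_sketch(x, hx, sx, d))
    fy = np.fft.rfft(count_sketch(y, hy, sy, d))
    # Element-wise product in the frequency domain = circular convolution.
    return np.fft.irfft(fx * fy, n=d)

# Toy sizes (assumed for illustration): n-dim inputs, d-dim pooled output.
rng = np.random.default_rng(0)
n, d = 64, 16
x, y = rng.standard_normal(n), rng.standard_normal(n)
hx, hy = rng.integers(0, d, n), rng.integers(0, d, n)              # bucket hashes
sx, sy = rng.choice([-1.0, 1.0], n), rng.choice([-1.0, 1.0], n)    # random signs

z = mcb(x, y, hx, sx, hy, sy, d)

# Reference: count sketch of the explicit n*n outer product, using the
# combined hash (hx[i] + hy[j]) mod d and the combined sign sx[i] * sy[j].
outer = np.outer(sx * x, sy * y)
idx = (hx[:, None] + hy[None, :]) % d
direct = np.zeros(d)
np.add.at(direct, idx.ravel(), outer.ravel())

print(z.shape, np.allclose(z, direct))
```

The reference computation materializes all `n * n` outer-product entries, while `mcb` only ever handles vectors of length `n` and `d`, which is the point of the compact formulation.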




© 2017. by isme2n
