Deformable Convolutional Networks

18 Jan 2019 in Studies on Deep Learning, Computer Vision

WHY?

The spatial sampling of a convolutional neural network is geometrically fixed. This paper proposes two modules that let CNNs capture geometric structure more flexibly.

U-net: Convolutional Networks for Biomedical Image Segmentation

14 Jan 2019 in Studies on Deep Learning, Computer Vision

WHY?

Image segmentation requires a large number of annotated images. This paper proposes training segmentation models efficiently using data augmentation and a new architecture.

Bilinear Attention Networks

07 Jan 2019 in Studies on Deep Learning, Computer Vision

WHY?

Representing the bilinear relationship between two inputs is expensive. MLB reduced the number of parameters by replacing the bilinear operation with a Hadamard product. This paper extends the idea to capture bilinear attention between two multi-channel inputs.
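As a rough illustration of why the Hadamard-product form is cheaper, the sketch below compares parameter counts for a full bilinear map and a low-rank factorization of the form P^T (U^T x ∘ V^T y). The function names, the dimensions n, m, c, and the rank d are assumptions chosen for illustration, not values taken from the paper.

```python
def matvec_t(W, v):
    """Compute W^T v, where W is a list of len(v) rows."""
    cols = len(W[0])
    return [sum(W[i][j] * v[i] for i in range(len(v))) for j in range(cols)]


def low_rank_bilinear(x, y, U, V, P):
    """Low-rank bilinear pooling: P^T (U^T x * V^T y), with * elementwise."""
    xu = matvec_t(U, x)                      # project x into the rank-d space
    yv = matvec_t(V, y)                      # project y into the rank-d space
    joint = [a * b for a, b in zip(xu, yv)]  # Hadamard (elementwise) product
    return matvec_t(P, joint)                # project to the output size


def full_bilinear_params(n, m, c):
    """One n-by-m weight matrix per output unit."""
    return n * m * c


def low_rank_params(n, m, c, d):
    """U is n-by-d, V is m-by-d, P is d-by-c."""
    return d * (n + m + c)


if __name__ == "__main__":
    # Hypothetical sizes, for illustration only.
    n, m, c, d = 2048, 1024, 1000, 1200
    print(full_bilinear_params(n, m, c))
    print(low_rank_params(n, m, c, d))
```

With these hypothetical sizes the full bilinear map needs n·m·c weights, while the factorized form needs only d·(n + m + c), which is why the factorization is attractive when n and m are large.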

SSD: Single Shot MultiBox Detector

03 Jan 2019 in Studies on Deep Learning, Computer Vision

WHY?

The box proposal stage in object detection is complicated and slow. This paper proposes the Single Shot Detector (SSD), which detects objects with a single neural network.

Inferring and Executing Programs for Visual Reasoning

15 Dec 2018 in Studies on Deep Learning, Computer Vision

WHY?

Neural module networks do not generalize well to new questions because their performance relies on a syntactic parser.

Deep Compositional Question Answering with Neural Module Networks

15 Dec 2018 in Studies on Deep Learning, Computer Vision

WHY?

The visual question answering task is compositional in nature.

Tracking Emerges by Colorizing Videos

13 Dec 2018 in Studies on Deep Learning, Computer Vision

WHY?

Segmenting objects in videos is difficult without manual labels.

FiLM: Visual Reasoning with a General Conditioning Layer

01 Dec 2018 in Studies on Deep Learning, Computer Vision

WHY?

Several architectures exist for relational reasoning, but general-purpose components for relational reasoning and visual question answering are lacking.

Modularity Matters: Learning Invariant Relational Reasoning Tasks

26 Nov 2018 in Studies on Deep Learning, Computer Vision

WHY?

Earlier CNN models activate fully (fully distributed features) for a single input and perform poorly on invariant relational reasoning.

Learning Visual Question Answering by Bootstrapping Hard Attention

23 Nov 2018 in Studies on Deep Learning, Computer Vision

WHY?

Hard attention is relatively less explored than soft attention.

Relationships from Entity Stream

21 Nov 2018 in Studies on Deep Learning, Computer Vision

WHY?

The Relation Network showed strong performance in relational reasoning, but its computation and memory consumption grow quadratically with the number of objects because every pair of objects is processed.
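The quadratic growth can be seen by simply enumerating the pairs. The sketch below is an illustration only: it assumes ordered pairs with self-pairs included, and the helper names are hypothetical.

```python
from itertools import product


def all_pairs(objects):
    """Every ordered pair of objects, self-pairs included (an assumption)."""
    return list(product(objects, repeat=2))


def num_pairs(n):
    """Number of ordered pairs for n objects."""
    return n * n


if __name__ == "__main__":
    for n in (8, 16, 32):
        # Doubling the number of objects quadruples the number of pairs.
        print(n, num_pairs(n))
```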

Learning to See by Moving

28 Jun 2018 in Studies on Deep Learning, Computer Vision

WHY?

Feature learning in convolutional neural networks requires a large amount of hand-labeled data. It would be useful to exploit other forms of supervision. In nature, organisms acquire much of their essential visual information by moving themselves (egomotion).

Spatial Transformer Networks

13 Jun 2018 in Studies on Deep Learning, Computer Vision

WHY?

Earlier CNNs were not spatially invariant.

Dynamic Routing Between Capsules

20 Feb 2018 in Studies on Deep Learning, Computer Vision

WHY?

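The problem with conventional CNNs is that the max-pooling layer only checks roughly whether a feature is present and discards its precise spatial information. As a result, the network is invariant: it can detect a feature wherever it appears, but it cannot tell when that feature is completely out of harmony with the other features. What we want is equivariance, which accounts not only for the mere presence of features but also for their overall arrangement.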
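A small sketch of the information loss in max-pooling, assuming a 2x2 window with stride 2 on a toy feature map (the function name and the example values are made up for illustration). Two maps whose activations sit at different positions inside each window pool to the same output, so the exact locations can no longer be recovered.

```python
def max_pool_2x2(fmap):
    """2x2 max-pooling with stride 2; assumes even height and width."""
    rows, cols = len(fmap), len(fmap[0])
    return [
        [
            max(fmap[r][c], fmap[r][c + 1], fmap[r + 1][c], fmap[r + 1][c + 1])
            for c in range(0, cols, 2)
        ]
        for r in range(0, rows, 2)
    ]


# Same two activations, placed differently inside their pooling windows.
map_a = [
    [9, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 5],
]
map_b = [
    [0, 0, 0, 0],
    [0, 9, 0, 0],
    [0, 0, 5, 0],
    [0, 0, 0, 0],
]

if __name__ == "__main__":
    print(max_pool_2x2(map_a))
    print(max_pool_2x2(map_b))
```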

Mask R-CNN

17 Feb 2018 in Studies on Deep Learning, Computer Vision

WHY?

Going one step beyond Faster-RCNN, which draws boxes around objects in an image (detection), this paper proposes a model that marks the region occupied by each object (segmentation).
