Deformable Convolutional Networks
in Studies on Deep Learning, Computer Vision
WHY?
The spatial sampling locations of a convolutional neural network are geometrically fixed. This paper proposes two modules that let a CNN capture geometric structure more flexibly.
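To make "geometrically fixed" concrete, here is a minimal pure-Python sketch of a single 3x3 convolution output where each kernel tap is shifted by a fractional offset and read with bilinear interpolation. It is only an illustration under stated assumptions: the names `bilinear_sample`, `deformable_conv_at`, and `offsets` are hypothetical, and the offsets are passed in by hand, whereas in the paper they are predicted from the input features.

```python
import math


def bilinear_sample(image, y, x):
    """Bilinearly interpolate `image` (a list of rows) at fractional (y, x).

    Positions outside the image contribute zero.
    """
    h, w = len(image), len(image[0])
    y0, x0 = math.floor(y), math.floor(x)
    value = 0.0
    for yy in (y0, y0 + 1):
        for xx in (x0, x0 + 1):
            if 0 <= yy < h and 0 <= xx < w:
                weight = (1 - abs(y - yy)) * (1 - abs(x - xx))
                value += weight * image[yy][xx]
    return value


def deformable_conv_at(image, kernel, cy, cx, offsets):
    """Output of a 3x3 convolution at center (cy, cx) with per-tap offsets.

    offsets[k] = (dy, dx) is a hypothetical offset for kernel tap k.
    All-zero offsets reduce this to an ordinary fixed-grid convolution.
    """
    out = 0.0
    k = 0
    for ky in (-1, 0, 1):
        for kx in (-1, 0, 1):
            dy, dx = offsets[k]
            out += kernel[ky + 1][kx + 1] * bilinear_sample(
                image, cy + ky + dy, cx + kx + dx
            )
            k += 1
    return out


image = [[r * 5 + c for c in range(5)] for r in range(5)]
kernel = [[1.0] * 3 for _ in range(3)]
fixed = deformable_conv_at(image, kernel, 2, 2, [(0.0, 0.0)] * 9)
shifted = deformable_conv_at(image, kernel, 2, 2, [(0.5, 0.0)] * 9)
print(fixed, shifted)
```

With zero offsets the taps sit on the regular grid; nonzero offsets move each tap independently, which is what lets the sampling pattern follow the shape of the object.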
in Studies on Deep Learning, Computer Vision
Image segmentation requires many annotated images. This paper proposes training a segmentation model efficiently by using data augmentation and a new architecture.
in Studies on Deep Learning, Computer Vision
Representing the bilinear relationship between two inputs is expensive. MLB reduced the number of parameters by replacing the bilinear operation with a Hadamard product. This paper extends that idea to capture bilinear attention between two multi-channel inputs.
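The parameter saving can be sketched as follows. This is a simplified illustration, not the paper's implementation: it omits nonlinearities and biases, and the names `U`, `V`, `P` and the helper functions are assumptions introduced here. Both inputs are projected to a shared dimension d, multiplied element-wise, and projected to the output.

```python
def matvec_t(matrix, vec):
    """Compute matrix^T @ vec for a matrix stored as a list of rows."""
    cols = len(matrix[0])
    return [sum(matrix[i][j] * vec[i] for i in range(len(vec))) for j in range(cols)]


def low_rank_bilinear(x, y, U, V, P):
    """Low-rank bilinear form: P^T ((U^T x) * (V^T y)), with * element-wise."""
    hx = matvec_t(U, x)  # project x to d dimensions
    hy = matvec_t(V, y)  # project y to d dimensions
    joint = [a * b for a, b in zip(hx, hy)]  # Hadamard product
    return matvec_t(P, joint)


def full_bilinear_params(n, m, c):
    """One n-by-m weight matrix per output unit."""
    return n * m * c


def low_rank_params(n, m, c, d):
    """U is n-by-d, V is m-by-d, P is d-by-c."""
    return d * (n + m + c)


U = [[1, 0], [0, 1]]
V = [[1, 0], [0, 1], [1, 1]]
P = [[1], [1]]
result = low_rank_bilinear([1, 2], [3, 4, 5], U, V, P)
print(result, full_bilinear_params(100, 200, 10), low_rank_params(100, 200, 10, 50))
```

For the example sizes the full bilinear form needs 200000 weights while the low-rank form needs 15500, because the cost grows with the sum of the dimensions rather than their product.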
in Studies on Deep Learning, Computer Vision
The box proposal step in object detection is complicated and slow. This paper proposes the Single Shot Detector (SSD), which detects objects with a single neural network.
in Studies on Deep Learning, Computer Vision
Neural module networks do not generalize well to new questions because their performance relies on a syntactic parser.
in Studies on Deep Learning, Computer Vision
The visual question answering task is compositional in nature.
in Studies on Deep Learning, Computer Vision
Segmenting objects in videos is difficult without manual labels.
in Studies on Deep Learning, Computer Vision
Some architectures for relational reasoning exist, but they lack general-purpose components for relational reasoning and visual question answering.
in Studies on Deep Learning, Computer Vision
Earlier CNN models activate fully (fully distributed features) for a single input, and they perform poorly on invariant relational reasoning.
in Studies on Deep Learning, Computer Vision
Hard attention is relatively less explored than soft attention.
in Studies on Deep Learning, Computer Vision
The Relation Network performs well in relational reasoning, but its computation and memory consumption grow quadratically with the number of objects because every object is paired with every other object.
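The quadratic growth follows directly from enumerating all ordered pairs. The sketch below is only an illustration: `pairwise_relation_sum` and the toy relation function `g` are hypothetical stand-ins for the learned pairwise module.

```python
from itertools import product


def pairwise_relation_sum(objects, g):
    """Apply g to every ordered pair of objects and sum the results.

    Returns the sum and the number of pairs that were evaluated.
    """
    total = 0
    count = 0
    for a, b in product(objects, repeat=2):
        total += g(a, b)
        count += 1
    return total, count


def g(a, b):
    """Toy relation function, used only for illustration."""
    return a * b


_, pairs_4 = pairwise_relation_sum(range(4), g)
_, pairs_8 = pairwise_relation_sum(range(8), g)
print(pairs_4, pairs_8)
```

Doubling the number of objects from 4 to 8 raises the number of evaluated pairs from 16 to 64.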
in Studies on Deep Learning, Computer Vision
Feature learning in convolutional neural networks requires a large amount of hand-labeled data, so it would be useful to have other forms of supervision. In nature, organisms acquire much of the essential information about vision by moving themselves (egomotion).
in Studies on Deep Learning, Computer Vision
Earlier CNNs were not spatially invariant.
in Studies on Deep Learning, Computer Vision
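The problem with conventional CNNs is that the max-pooling layer only checks whether a feature is roughly present and discards its precise spatial information. As a result, the network is invariant in the sense that it can detect a feature wherever it appears, but it cannot tell when that feature is completely out of place relative to the other features. What we want is equivariance, which accounts not only for the mere presence of features but also for how they fit together as a whole.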
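The loss of spatial information in max pooling can be seen in a small example. This is a minimal sketch with a hypothetical helper `max_pool_2x2` (non-overlapping 2x2 windows, stride 2): two inputs with the same feature at different positions inside one window produce identical pooled outputs.

```python
def max_pool_2x2(image):
    """Non-overlapping 2x2 max pooling over a list-of-rows image."""
    h, w = len(image), len(image[0])
    return [
        [
            max(image[r][c], image[r][c + 1], image[r + 1][c], image[r + 1][c + 1])
            for c in range(0, w - 1, 2)
        ]
        for r in range(0, h - 1, 2)
    ]


# The same feature (value 1) at two different positions in the top-left window.
feature_top_left = [
    [1, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
feature_shifted = [
    [0, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]

pooled_a = max_pool_2x2(feature_top_left)
pooled_b = max_pool_2x2(feature_shifted)
print(pooled_a, pooled_b)
```

After pooling, the two inputs are indistinguishable, so later layers cannot recover where inside the window the feature was.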
in Studies on Deep Learning, Computer Vision
Going one step beyond Faster R-CNN, which draws boxes around objects in an image (detection), this paper proposes a model that marks the region occupied by each object (segmentation).