0%

Reading Note: Inside-Outside Net: Detecting Objects in Context with Skip Pooling and Recurrent Neural Networks

TITLE: Inside-Outside Net: Detecting Objects in Context with Skip Pooling and Recurrent Neural Networks

AUTHER: Sean Bell, C. Lawrence Zitnick, Kavita Bala, Ross Girshick

ASSOCIATION: Cornell University, Microsoft Research

FROM: arXiv:1512.04143

CONTRIBUTIONS

  1. ION architecture is introduce that leverages context and multi-scale skip pooling for object detection. Use the information both inside and outside the ROI to determine the detection result.

METHOD

The main steps of the method is shown in the following figure.

  1. The image is first fed into a CNN, e.g.VGG16.
  2. ROI proposals are generated in the same way of Fast R-CNN.
  3. The information within the ROI are extracted by ROI pooling on different feature maps from different convolutional layers of different scales.
  4. The information outside the ROI are extracted by 2 successive 4-direction IRNNs. And ROI pooling is used to extract features.
  5. The pooled features are L2 nomalized and concated. Then a 1X1 conv layer is used to reduce the dimension.
  6. Two branches are learned to predict category and location.

some details

A 4-direction IRNN contains 4 independent IRNNs and each IRNN moves in different directions (left, right, up and down). The internal IRNN computations are splitted into separate logical layers. the input-to-hidden transition is implemented by a 1x1 convolution, and its computation can be shared across different directions.

ADVANTAGES

  1. The proposed detector works better on smaller objects compared with other works.
  2. Both local and global information are take into account.
  3. Skip pooling uses the informaiton of different scales.
  4. Two successive 4-direction IRNN cover the information form the whole image.