TITLE: SuperCNN: A Superpixelwise Convolutional Neural Network for Salient Object Detection
AUTHOR: Shengfeng He, Rynson W.H. Lau, Wenxi Liu, Zhe Huang, Qingxiong Yang
FROM: IJCV2015
CONTRIBUTIONS
- A novel superpixel-wise convolutional neural network approach is proposed.
- Two kinds of sequence codes are designed as input features to the CNN.
METHOD
- Superpixels are extracted by over-segmenting the image.
- Extract a Color Uniqueness Sequence (CU) for each superpixel to describe the color contrast between regions.
- Extract a Color Distribution Sequence (CD) for each superpixel to measure how compactly colors are distributed.
- The two sequences are fed into a CNN to generate two saliency maps.
- A regressor is used to merge the two predicted saliency maps.
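Both sequences are built from per-region statistics: pixel count, mean color, and mean position. The following is a minimal sketch of how those statistics could be gathered, assuming a superpixel label map is already available (the segmentation step itself is not shown). The helper name `region_stats` and the nested-list image layout are illustrative assumptions, not part of the paper.

```python
def region_stats(labels, image):
    """Collect per-region statistics from a superpixel label map.

    labels: 2D list of integer region ids (assumed layout).
    image:  2D list of 3-channel color tuples, same shape as labels.
    Returns {region_id: (pixel_count, mean_color, mean_position)},
    where mean_position is (row, col) in pixel units.
    """
    sums = {}
    for row, (label_row, pixel_row) in enumerate(zip(labels, image)):
        for col, (rid, color) in enumerate(zip(label_row, pixel_row)):
            # Accumulator: [count, color sums, position sums].
            acc = sums.setdefault(rid, [0, [0.0, 0.0, 0.0], [0.0, 0.0]])
            acc[0] += 1
            for k in range(3):
                acc[1][k] += color[k]
            acc[2][0] += row
            acc[2][1] += col

    stats = {}
    for rid, (count, color_sum, pos_sum) in sums.items():
        mean_color = tuple(v / count for v in color_sum)
        mean_pos = tuple(v / count for v in pos_sum)
        stats[rid] = (count, mean_color, mean_pos)
    return stats
```

For example, `region_stats([[0, 0], [1, 1]], img)` on a 2x2 image yields one entry per row-shaped region, each with a pixel count of 2.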
SOME DETAILS
Color Uniqueness Sequence describes the color contrast of a region. Given an image \(I\) and its superpixels (regions) \(R=\lbrace r_{1},\ldots,r_{x},\ldots,r_{N} \rbrace\), each region \(r_{x}\) has a color uniqueness sequence \(Q_{x}^{C} = \lbrace q_{1}^{c},\ldots,q_{j}^{c},\ldots,q_{N}^{c} \rbrace\). Each element \(q_{j}^{c}\) is defined as
where \(t(r_{j})\) counts the total number of pixels in region \(r_{j}\). \(\vert C(r_{x})-C(r_{j}) \vert\) is a 3D vector storing the absolute difference in each color channel. \(P(r_{x})\) is the mean position of region \(r_{x}\), and \(w(P(r_{x}),P(r_{j}))\) is defined as
The sequence \(Q_{x}^{C}\) is sorted by spatial distance to region \(r_{x}\).
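The defining equations are not reproduced in these notes, so the following is only a sketch of how a CU sequence might be assembled from the terms described above. It assumes the three terms are multiplied together and that \(w\) is a Gaussian of the squared spatial distance; the bandwidth `sigma_s`, the normalized positions, and the function name are all hypothetical.

```python
import math


def color_uniqueness_sequence(x, sizes, colors, positions, sigma_s=0.25):
    """Build a CU sequence for region x (illustrative helper, not the paper's code).

    sizes[j]:     pixel count t(r_j)
    colors[j]:    color C(r_j) as a 3-tuple
    positions[j]: mean position P(r_j), assumed normalized to [0, 1]
    """
    px, cx = positions[x], colors[x]
    entries = []
    for j in range(len(sizes)):
        dist2 = sum((a - b) ** 2 for a, b in zip(px, positions[j]))
        # ASSUMPTION: Gaussian spatial weight with hypothetical bandwidth sigma_s.
        w = math.exp(-dist2 / (2.0 * sigma_s ** 2))
        # 3D vector of per-channel absolute color differences.
        diff = tuple(abs(a - b) for a, b in zip(cx, colors[j]))
        # ASSUMPTION: element = size * color difference * spatial weight.
        q = tuple(sizes[j] * d * w for d in diff)
        entries.append((dist2, j, q))
    # Sort by spatial distance to region x; ties are broken by region index.
    entries.sort(key=lambda e: (e[0], e[1]))
    return [q for _, _, q in entries]
```

Each element stays a 3D vector, and region \(r_{x}\) itself always comes first with an all-zero entry because its distance and color difference to itself are both zero.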
Color Distribution Sequence is a sequence \(Q_{x}^{D} = \lbrace q_{1}^{d},\ldots,q_{j}^{d},\ldots,q_{N}^{d} \rbrace\) with the element \(q_{j}^{d}\) defined as:
where
The sequence is also sorted by spatial distance to region \(r_{x}\).
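The equations for \(q_{j}^{d}\) are likewise missing from these notes. As a rough sketch only, the code below assumes the CD element mirrors the CU element with the roles of color and position swapped: the per-axis absolute position offset is scaled by the region size and by a Gaussian weight on color distance. That form, the bandwidth `sigma_c`, and the function name are assumptions made for illustration.

```python
import math


def color_distribution_sequence(x, sizes, colors, positions, sigma_c=0.2):
    """Build a CD sequence for region x (illustrative helper, assumed form).

    sizes[j]:     pixel count of region j
    colors[j]:    color of region j as a 3-tuple
    positions[j]: mean position of region j as (row, col)
    """
    px, cx = positions[x], colors[x]
    entries = []
    for j in range(len(sizes)):
        color_dist2 = sum((a - b) ** 2 for a, b in zip(cx, colors[j]))
        # ASSUMPTION: Gaussian color-similarity weight, hypothetical sigma_c.
        w = math.exp(-color_dist2 / (2.0 * sigma_c ** 2))
        # 2D vector of per-axis absolute position offsets.
        offset = tuple(abs(a - b) for a, b in zip(px, positions[j]))
        # ASSUMPTION: element = size * position offset * color weight.
        q = tuple(sizes[j] * o * w for o in offset)
        spatial_dist2 = sum(o * o for o in offset)
        entries.append((spatial_dist2, j, q))
    # As with CU, order the elements by spatial distance to region x.
    entries.sort(key=lambda e: (e[0], e[1]))
    return [q for _, _, q in entries]
```

Under this reading, similarly colored regions that lie far from \(r_{x}\) produce large entries, which is one way a sequence could signal that a color is spread out rather than compact.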
Network Structure is briefly illustrated as below:
Saliency Inference first predicts a saliency score for each of the \(N\) regions. Because there are two kinds of sequences, two sets of scores, \(S_{1}\) and \(S_{2}\), are predicted. The final saliency map is obtained by:
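The merging formula is not reproduced in these notes. As one possible reading of "a regressor", the sketch below assumes a linear combination of the two score sets whose weights are fit by ordinary least squares against ground-truth saliency. The linear form, the clamping to \([0, 1]\), and both function names are assumptions.

```python
def fit_merge_weights(s1, s2, target):
    """Least-squares weights (a, b) minimizing sum((a*s1 + b*s2 - target)^2).

    Solves the 2x2 normal equations directly (illustrative helper).
    """
    a11 = sum(u * u for u in s1)
    a12 = sum(u * v for u, v in zip(s1, s2))
    a22 = sum(v * v for v in s2)
    b1 = sum(u * t for u, t in zip(s1, target))
    b2 = sum(v * t for v, t in zip(s2, target))
    det = a11 * a22 - a12 * a12
    if abs(det) < 1e-12:
        raise ValueError("score sets are collinear; weights are not unique")
    return ((b1 * a22 - b2 * a12) / det, (a11 * b2 - a12 * b1) / det)


def merge_scores(s1, s2, weights):
    """Combine two per-region score lists and clamp the result to [0, 1]."""
    a, b = weights
    return [min(1.0, max(0.0, a * u + b * v)) for u, v in zip(s1, s2)]
```

The weights would be fit once on training regions with `fit_merge_weights` and then reused at inference time through `merge_scores`, so merging adds almost no cost per image.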
ADVANTAGES
- Inference is fast.
- A large context is encoded in the sequences.
DISADVANTAGES
- The CNN has a very lightweight structure. A deeper network might perform better.
- The sequences describe contrast information, so the method may fail when the foreground and background have similar colors.