Mask R-CNN

Unlike R-CNN, Fast R-CNN, and Faster R-CNN, which output bounding boxes rather than the actual shapes of objects, Mask R-CNN [59] extends Faster R-CNN to segment images at the pixel level. In Mask R-CNN, a convolutional backbone extracts features over the whole image, and a network head performs bounding-box recognition (classification and regression). A mask branch is added in parallel to predict a segmentation mask for each detected region. The head architecture of Mask R-CNN with a Feature Pyramid Network (FPN) backbone is shown in Figure 2-47. Some results on the COCO (Common Objects in Context) dataset are shown in Figure 2-48.
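To make the difference between a bounding box and a pixel-level mask concrete, here is a minimal sketch in plain Python. It is not the implementation from [59]. It assumes the mask branch has already produced per-pixel foreground probabilities for one detected box, resized to the box's size, and that a threshold of 0.5 turns them into a binary mask. The function names `paste_mask` and `mask_area` and all toy values are hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: names and toy numbers are hypothetical,
# not taken from the Mask R-CNN implementation in [59].

def paste_mask(mask_probs, box, image_size, threshold=0.5):
    """Threshold a per-box mask and place it into a full-size image mask.

    mask_probs: 2-D list of foreground probabilities for one detection,
                assumed to be already resized to the box height and width.
    box:        (x0, y0, x1, y1) in pixels, with x1 and y1 exclusive.
    image_size: (height, width) of the full image.
    """
    height, width = image_size
    x0, y0, x1, y1 = box
    full = [[0] * width for _ in range(height)]
    for row in range(y0, y1):
        for col in range(x0, x1):
            # A pixel belongs to the object only if its probability
            # reaches the threshold; the box alone cannot say this.
            if mask_probs[row - y0][col - x0] >= threshold:
                full[row][col] = 1
    return full


def mask_area(binary_mask):
    """Count the pixels labeled as object."""
    return sum(sum(row) for row in binary_mask)


# Toy detection: a 3x3 box inside a 5x5 image, containing a plus-shaped object.
box = (1, 1, 4, 4)
mask_probs = [
    [0.10, 0.90, 0.20],
    [0.80, 0.95, 0.70],
    [0.30, 0.85, 0.10],
]

instance_mask = paste_mask(mask_probs, box, image_size=(5, 5))
box_area = (box[2] - box[0]) * (box[3] - box[1])
object_area = mask_area(instance_mask)

for row in instance_mask:
    print(row)
print("box area:", box_area, "mask area:", object_area)
```

In this toy case the box covers 9 pixels while the mask marks only the 5 pixels of the plus shape, which is the extra information the mask branch contributes beyond the box.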

Figure 2-47 Head architecture of Mask R-CNN, extending the Faster R-CNN head [59]

Figure 2-48 Keypoint detection results and predicted segmentation masks [59]
