Fast R-CNN
Understanding Why It’s 213 Times Faster than R-CNN and More Accurate
In 2013, Ross Girshick et al. introduced R-CNN, an object detection model that combined convolutional layers with existing computer vision techniques and broke previous records. It was a groundbreaking model at the time. In 2015, Ross Girshick developed Fast R-CNN, which set a new record: it was more accurate, and its inference was reported to be 213 times faster. Of course, a number like that only means something once we know what was being compared. So this article examines the results published in the paper to understand how Fast R-CNN became that fast.
If you are not familiar with R-CNN, please read the previous article first so that this article makes more sense.
Why R-CNN Was Slow
In the original R-CNN paper, Ross Girshick et al. showed that R-CNN was more accurate than OverFeat (Pierre Sermanet, Yann LeCun, et al.) but also pointed out that it was about nine times slower than OverFeat. So making R-CNN faster was an obvious next step.
Speeding up R-CNN should be possible in a variety of ways and remains as future work.
Source: paper
However, the figure below, taken from the paper, shows that the R-CNN pipeline is rather complex.