This repository has been archived by the owner on Aug 5, 2022. It is now read-only.

SSD: Single Shot MultiBox Detector

Sylwester edited this page Feb 20, 2017 · 18 revisions

Introduction

This page provides tips and a usage guide for SSD in Intel® Distribution of Caffe*. It gives a brief introduction to help you run training, inference, and scoring. The original project website is https://github.com/weiliu89/caffe/tree/ssd. Please note that this implementation is not 100% identical to the original.

Important Note: SSD currently does not work with the MKL2017 or MKL-DNN engine on every layer. The engines are therefore already specified per layer in the prototxt files. Do not specify a default engine when using SSD, as that would override the per-layer settings. This will change once SSD-specific concat and dilated convolution layers are supported. Multi-node also does not fully support SSD yet.
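To see which engine each layer is pinned to, it can help to list the engine fields in a prototxt. The sketch below is only an illustration: list_layer_engines is a hypothetical helper, not part of Caffe, and it assumes the engines are written as "engine: <NAME>" fields on their own lines.

```shell
# Hypothetical helper (not part of Caffe): print every line of a prototxt
# that sets an "engine" field, prefixed with its line number.
# Assumption: engines are written as "engine: <NAME>" on their own line.
list_layer_engines() {
    # "|| true" keeps the helper from failing when no engine is specified.
    grep -n -E '^[[:space:]]*engine:' "$1" || true
}

# Example, run from the Caffe directory:
# list_layer_engines examples/ssd/VGGNet/VOC0712/SSD_300x300/deploy.prototxt
```

An empty result would suggest that the file does not pin engines in the assumed form.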

Changes compared to the original SSD project

Supported topologies

The Python scripts that generated the network-defining prototxt files have been removed and are not supported. They could still be used if modified correctly, but they do not generate the optimized models and, by default, worked only on GPU.

You can find predefined SSD topologies, based on AlexNet and VGG and optimized for Intel architecture, in $CAFFE_ROOT/examples/ssd. Please note that AlexNet might not give state-of-the-art accuracy. We do not provide accuracy scores; please refer to the original paper.

Setting up

Before you start using SSD, you might need to set the environment variables found in examples/ssd/ssdvars.sh. Edit the file to point to your specific locations, then import the variables by calling:

source examples/ssd/ssdvars.sh

Then download the fully convolutional reduced (atrous) VGGNet from the original SSD project website and extract the .caffemodel only. By default, we assume the model is stored in $CAFFE_ROOT/examples/ssd/VGGNet/.
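After sourcing the file, a quick check that the variables used on this page are actually set can save a failed run later. This is a minimal sketch: check_ssd_env is a hypothetical helper, and the assumption that ssdvars.sh defines exactly CAFFE_ROOT and DATAPATH comes only from their use in the commands on this page.

```shell
# Hypothetical helper: report which of the named environment variables
# are unset or empty. Returns 0 when all are set, 1 otherwise.
check_ssd_env() {
    local name
    local value
    local missing=0
    for name in "$@"; do
        # Indirect lookup of the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "missing: $name"
            missing=1
        fi
    done
    return "$missing"
}

# Example (variable names assumed from the commands on this page):
# check_ssd_env CAFFE_ROOT DATAPATH
```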

Datasets

You can download the datasets as instructed on the original SSD project's GitHub site.

VOC example

LMDB creation

Download the data and extract it:

# Download the data.
cd $DATAPATH/data
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2012/VOCtrainval_11-May-2012.tar
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtrainval_06-Nov-2007.tar
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtest_06-Nov-2007.tar
# Extract the data.
tar -xvf VOCtrainval_11-May-2012.tar
tar -xvf VOCtrainval_06-Nov-2007.tar
tar -xvf VOCtest_06-Nov-2007.tar
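The VOC archives unpack into VOCdevkit/VOC2007 and VOCdevkit/VOC2012. A small sanity check of the extracted layout could look like the sketch below; check_voc_layout is a hypothetical helper, and the list of subdirectories it checks is a minimal selection rather than a complete description of the dataset.

```shell
# Hypothetical helper: verify that the extracted VOC data has the expected
# VOCdevkit/<year>/<subdir> layout under the given root directory.
check_voc_layout() {
    local root="$1"
    local year
    local sub
    local status=0
    for year in VOC2007 VOC2012; do
        for sub in Annotations JPEGImages ImageSets; do
            if [ ! -d "$root/VOCdevkit/$year/$sub" ]; then
                echo "missing: $root/VOCdevkit/$year/$sub"
                status=1
            fi
        done
    done
    return "$status"
}

# Example, using the download directory from the commands above:
# check_voc_layout "$DATAPATH/data"
```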

Create the LMDB files:

# Go to your caffe directory
cd $CAFFE_ROOT
# Create the trainval.txt, test.txt, and test_name_size.txt in data/VOC0712/
./data/VOC0712/create_list.sh
# You can modify the parameters in create_data.sh if needed.
# It will create LMDB files for trainval and test with encoded original images:
# - $DATAPATH/data/VOCdevkit/VOC0712/lmdb/VOC0712_trainval_lmdb
# - $DATAPATH/data/VOCdevkit/VOC0712/lmdb/VOC0712_test_lmdb
# and make soft links at examples/VOC0712/
./data/VOC0712/create_data.sh
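Before training, you may want to confirm that both databases were written. An LMDB database is a directory containing a data.mdb file, so a rough check is to test for a non-empty data.mdb. The helper name check_lmdb below is hypothetical; the paths in the usage comment are the ones listed in the script comments above.

```shell
# Hypothetical helper: confirm that a directory looks like an LMDB database,
# i.e. that it contains a non-empty data.mdb file.
check_lmdb() {
    local db="$1"
    if [ -s "$db/data.mdb" ]; then
        echo "ok: $db"
    else
        echo "missing or empty: $db"
        return 1
    fi
}

# Example:
# check_lmdb "$DATAPATH/data/VOCdevkit/VOC0712/lmdb/VOC0712_trainval_lmdb"
# check_lmdb "$DATAPATH/data/VOCdevkit/VOC0712/lmdb/VOC0712_test_lmdb"
```

This only checks that the file exists and is not empty; it does not validate the records inside.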

Training

Train your model and evaluate it:

./build/tools/caffe train -solver examples/ssd/VGGNet/VOC0712/SSD_300x300/solver.prototxt \
-weights examples/ssd/VGGNet/VGG_ILSVRC_16_layers_fc_reduced.caffemodel

It should reach 77.* mAP at 120k iterations according to the author's reports. (Make sure you have completed the Setting up step.)
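To follow progress over a long run, you can save the training output to a file and extract the reported loss values. The sketch below assumes the log contains lines of the form "Iteration N, loss = X", as in the sample output on this page; extract_loss and the file name train.log are hypothetical.

```shell
# Hypothetical helper: print "<iteration> <loss>" for every log line of the
# form "Iteration N, loss = X". Other lines are ignored.
extract_loss() {
    sed -n 's/.*Iteration \([0-9][0-9]*\), loss = \([0-9.eE+-][0-9.eE+-]*\).*/\1 \2/p' "$1"
}

# Example, assuming the training output was saved to train.log, for instance
# by appending "2>&1 | tee train.log" to the training command:
# extract_loss train.log
```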

Scoring

To run scoring you need the LMDB files and the trained model you want to evaluate. You can download a model from the links at the bottom of the original SSD GitHub project page (extract ONLY the .caffemodel) and put the *120000.caffemodel file inside examples/ssd/VGGNet/VOC0712/SSD_300x300/. Run the line below and wait patiently for the result:

caffe test --detection --weights=<weights file> --model=<model file> --iterations=<no of iters>
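One way to choose the iteration count is to cover the test set exactly once. Assuming each test iteration processes one batch, as in standard Caffe, that is the number of test images divided by the test batch size, rounded up. The helper test_iterations below is hypothetical, and the numbers in the usage comment are illustrative only.

```shell
# Hypothetical helper: number of iterations needed to cover num_images once
# with the given batch size (integer ceiling division).
test_iterations() {
    local num_images="$1"
    local batch_size="$2"
    echo $(( (num_images + batch_size - 1) / batch_size ))
}

# Example with illustrative values (1000 images, batch size 8):
# test_iterations 1000 8
```

Take the batch size from the data layer of the model file you pass to --model.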

The --detection switch turns on SSD. It is off by default, so when the switch is absent from the command, classification is executed instead. The output should look like the example below, where detection_eval = XXXX is the mAP score:

I1219 11:22:03.161818 71583 caffe.cpp:155] Finetuning from models/VGGNet/VOC0712/SSD_300x300/VGG_VOC0712_SSD_300x300_iter_120000.caffemodel
I1219 11:22:03.588759 71583 net.cpp:761] Ignoring source layer mbox_loss
I1219 11:22:03.592058 71583 caffe.cpp:251] Starting Optimization
I1219 11:22:03.592092 71583 solver.cpp:294] Solving VGG_VOC0712_SSD_300x300_train
I1219 11:22:03.592097 71583 solver.cpp:295] Learning Rate Policy: multistep
I1219 11:22:04.485777 71583 solver.cpp:332] Iteration 0, loss = 1.46778
I1219 11:22:04.485929 71583 solver.cpp:433] Iteration 0, Testing net (#0)
I1219 11:22:04.491395 71583 net.cpp:693] Ignoring source layer mbox_loss
I1219 12:01:24.939699 71583 solver.cpp:546] Test net output #0: detection_eval = 0.776861
I1219 12:01:24.939939 71583 solver.cpp:337] Optimization Done.
I1219 12:01:24.939947 71583 caffe.cpp:254] Optimization Done.
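If the output is saved to a file, the mAP can be pulled out of the log automatically. This is a minimal sketch that relies only on the "detection_eval = X" text shown above; extract_map and the file name score.log are hypothetical.

```shell
# Hypothetical helper: print the last detection_eval value found in a log.
extract_map() {
    sed -n 's/.*detection_eval = \([0-9.][0-9.]*\).*/\1/p' "$1" | tail -n 1
}

# Example, assuming the scoring output was saved to score.log:
# extract_map score.log
```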

Inference example based on VOC

Download the models from the original SSD website. Modify the paths in the deploy.prototxt file to point to your output directory (for example, $DATAPATH/data/VOCdevkit/VOC0712/results/VOC2007/SSD_300x300/Main). Then run the lines below:

./build/examples/ssd/ssd_detect examples/ssd/VGGNet/VOC0712/SSD_300x300/deploy.prototxt \
examples/ssd/VGGNet/VOC0712/SSD_300x300/VGG_VOC0712_SSD_300x300_iter_120000.caffemodel \
examples/ssd/images.txt -out_file detected.txt

python ./tools/extra/plot_detections.py \
--labelmap-file ./data/VOC0712/labelmap_voc.prototxt \
detected.txt . --save-dir . --visualize-threshold 0.2

See the result: in the Caffe directory you will find a file named fish-bike.jpg.png, in which a person and a bicycle should be detected.
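To run detection on your own pictures, you would need your own list file in place of examples/ssd/images.txt. The sketch below assumes that the list file holds one image path per line, which is an assumption about the ssd_detect input format and not something this page states; make_image_list is a hypothetical helper.

```shell
# Hypothetical helper: write the paths of all .jpg and .png files under
# image_dir to list_file, one per line, in sorted order.
# Assumption: ssd_detect reads one image path per line from its list file.
make_image_list() {
    local image_dir="$1"
    local list_file="$2"
    find "$image_dir" -type f \( -name '*.jpg' -o -name '*.png' \) | sort > "$list_file"
}

# Example (directory and file names are placeholders):
# make_image_list "$HOME/my_images" my_images.txt
```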

COCO example

LMDB creation

Follow the instructions in $CAFFE_ROOT/data/coco/README.md. They are quite complex.

Training

You should already have the caffemodel from the Setting up section. Run training the same way as for VOC, but specify the path to the COCO solver instead.

Scoring

Download the COCO model from the original SSD website and run scoring the same way as for VOC. You should get mAP@0.5 = 0.430362. To test mAP@0.75 instead of the default mAP@0.5, change the overlap_threshold value in the prototxt for the MultiBoxLoss layer. The resulting score should be mAP@0.75 = 0.276951.
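Rather than editing the prototxt by hand, the threshold can be changed in a copy of the file. The sketch below is an illustration under stated assumptions: set_overlap_threshold is a hypothetical helper, the file names are placeholders, and the substitution rewrites every overlap_threshold field in the file, so review the result before using it.

```shell
# Hypothetical helper: write a copy of a prototxt in which every
# "overlap_threshold: <number>" field is set to the given value.
# The original file is left untouched.
set_overlap_threshold() {
    local src="$1"
    local dst="$2"
    local value="$3"
    sed "s/overlap_threshold:[[:space:]]*[0-9.][0-9.]*/overlap_threshold: $value/" "$src" > "$dst"
}

# Example (file names are placeholders):
# set_overlap_threshold test.prototxt test_map75.prototxt 0.75
# diff test.prototxt test_map75.prototxt   # review which layers changed
```

If the diff shows changes in layers you did not intend to modify, edit the copy by hand instead.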


*Other names and brands may be claimed as the property of others
