ImageNet Classification - darknet

    https://pjreddie.com/darknet/imagenet/

    ImageNet Classification

    You can use Darknet to classify images for the 1000-class ImageNet challenge. If you haven't installed Darknet yet, you should do that first.

    Classifying With Pre-Trained Models

    Here are the commands to install Darknet, download a classification weights file, and run a classifier on an image:

    git clone https://github.com/pjreddie/darknet.git
    cd darknet
    make
    wget https://pjreddie.com/media/files/darknet19.weights
    ./darknet classifier predict cfg/imagenet1k.data cfg/darknet19.cfg darknet19.weights data/dog.jpg
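
    A note on the make step: the stock Makefile builds a CPU-only binary. If you have a CUDA-capable GPU and/or OpenCV installed, you can flip the corresponding switches before (or after) building. A minimal sketch, assuming the usual Makefile where these flags default to 0:

    # Optional: enable GPU, cuDNN, OpenCV, and OpenMP support, then rebuild.
    # Assumes the stock Makefile where these switches default to 0.
    sed -i 's/^GPU=0/GPU=1/'       Makefile    # CUDA support
    sed -i 's/^CUDNN=0/CUDNN=1/'   Makefile    # cuDNN acceleration (needs GPU=1)
    sed -i 's/^OPENCV=0/OPENCV=1/' Makefile    # broader image format support
    sed -i 's/^OPENMP=0/OPENMP=1/' Makefile    # multi-threaded CPU inference
    make clean && make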
    

    This example uses the Darknet19 model; you can read more about it below. After running this command you should see the following output:

    layer     filters    size              input                output
        0 conv     32  3 x 3 / 1   256 x 256 x   3   ->   256 x 256 x  32  0.113 BFLOPs
        1 max          2 x 2 / 2   256 x 256 x  32   ->   128 x 128 x  32
        2 conv     64  3 x 3 / 1   128 x 128 x  32   ->   128 x 128 x  64  0.604 BFLOPs
        3 max          2 x 2 / 2   128 x 128 x  64   ->    64 x  64 x  64
        4 conv    128  3 x 3 / 1    64 x  64 x  64   ->    64 x  64 x 128  0.604 BFLOPs
        5 conv     64  1 x 1 / 1    64 x  64 x 128   ->    64 x  64 x  64  0.067 BFLOPs
        6 conv    128  3 x 3 / 1    64 x  64 x  64   ->    64 x  64 x 128  0.604 BFLOPs
        7 max          2 x 2 / 2    64 x  64 x 128   ->    32 x  32 x 128
        8 conv    256  3 x 3 / 1    32 x  32 x 128   ->    32 x  32 x 256  0.604 BFLOPs
        9 conv    128  1 x 1 / 1    32 x  32 x 256   ->    32 x  32 x 128  0.067 BFLOPs
       10 conv    256  3 x 3 / 1    32 x  32 x 128   ->    32 x  32 x 256  0.604 BFLOPs
       11 max          2 x 2 / 2    32 x  32 x 256   ->    16 x  16 x 256
       12 conv    512  3 x 3 / 1    16 x  16 x 256   ->    16 x  16 x 512  0.604 BFLOPs
       13 conv    256  1 x 1 / 1    16 x  16 x 512   ->    16 x  16 x 256  0.067 BFLOPs
       14 conv    512  3 x 3 / 1    16 x  16 x 256   ->    16 x  16 x 512  0.604 BFLOPs
       15 conv    256  1 x 1 / 1    16 x  16 x 512   ->    16 x  16 x 256  0.067 BFLOPs
       16 conv    512  3 x 3 / 1    16 x  16 x 256   ->    16 x  16 x 512  0.604 BFLOPs
       17 max          2 x 2 / 2    16 x  16 x 512   ->     8 x   8 x 512
       18 conv   1024  3 x 3 / 1     8 x   8 x 512   ->     8 x   8 x1024  0.604 BFLOPs
       19 conv    512  1 x 1 / 1     8 x   8 x1024   ->     8 x   8 x 512  0.067 BFLOPs
       20 conv   1024  3 x 3 / 1     8 x   8 x 512   ->     8 x   8 x1024  0.604 BFLOPs
       21 conv    512  1 x 1 / 1     8 x   8 x1024   ->     8 x   8 x 512  0.067 BFLOPs
       22 conv   1024  3 x 3 / 1     8 x   8 x 512   ->     8 x   8 x1024  0.604 BFLOPs
       23 conv   1000  1 x 1 / 1     8 x   8 x1024   ->     8 x   8 x1000  0.131 BFLOPs
       24 avg                        8 x   8 x1000   ->  1000
       25 softmax                                        1000
    Loading weights from darknet19.weights...Done!
    data/dog.jpg: Predicted in 0.769246 seconds.
    42.55%: malamute
    22.93%: Eskimo dog
    12.51%: Siberian husky
     2.76%: bicycle-built-for-two
     1.20%: mountain bike
    

    Darknet displays information as it loads the config file and weights, then it classifies the image and prints the top predicted classes for the image. Kelp is a mixed-breed dog but she has a lot of malamute in her so we'll consider this a success!
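
    The BFLOPs column above is just the multiply-add count for each layer. As a quick sanity check (my own arithmetic, not something Darknet prints), layer 0 costs 2 x (3x3 kernel) x 3 input channels x 32 filters x 256x256 output positions:

    # FLOPs for conv layer 0: 2 * k*k * c_in * c_out * out_h * out_w
    echo $((2 * 3*3 * 3 * 32 * 256*256))   # 113246208, i.e. ~0.113 BFLOPs

    That matches the 0.113 BFLOPs shown in the first row of the layer table.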

    You can also try with other images, like the bald eagle image:

    ./darknet classifier predict cfg/imagenet1k.data cfg/darknet19.cfg darknet19.weights data/eagle.jpg
    

    Which produces:

    ...
    data/eagle.jpg: Predicted in 0.707070 seconds.
    84.68%: bald eagle
    11.91%: kite
     2.62%: vulture
     0.08%: great grey owl
     0.07%: hen
    

    Pretty good!

    If you don't specify an image file you will be prompted at run-time for an image. This way you can classify multiple images in a row without reloading the whole model. Use the command:

    ./darknet classifier predict cfg/imagenet1k.data cfg/darknet19.cfg darknet19.weights
    

    Then you will get a prompt that looks like:

    ....
    25: Softmax Layer: 1000 inputs
    Loading weights from darknet19.weights...Done!
    Enter Image Path:
    

    Whenever you get bored of classifying images you can use Ctrl-C to exit the program.
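
    Because the prompt reads image paths from standard input, you can also pipe in a list of files and classify them in one run without reloading the model each time. A small sketch using the same files as above; if your build behaves differently, just use the interactive prompt:

    # Feed several image paths to the prompt in one invocation
    printf 'data/dog.jpg\ndata/eagle.jpg\n' | \
        ./darknet classifier predict cfg/imagenet1k.data cfg/darknet19.cfg darknet19.weights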

    Validating On ImageNet

    You see these validation set numbers thrown around everywhere. Maybe you want to double check for yourself how well these models actually work. Let's do it!

    First you need to download the validation images, and the cls-loc annotations. You can get them from the ImageNet website, but you'll have to make an account! Once you download everything you should have a directory with ILSVRC2012_bbox_val_v3.tgz, and ILSVRC2012_img_val.tar. First we unpack them:

    tar -xzf ILSVRC2012_bbox_val_v3.tgz
    mkdir -p imgs && tar xf ILSVRC2012_img_val.tar -C imgs
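
    It's worth a quick check that the unpacking worked; the ILSVRC2012 validation set contains 50,000 images:

    # Sanity check: the validation tarball should yield 50,000 JPEGs
    ls imgs | wc -l      # expect 50000
    ls imgs | head -3    # ILSVRC2012_val_00000001.JPEG, ...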
    

    Now we have the images and the annotations but we need to label the images so Darknet can evaluate its predictions. We do that using the imagenet_label.sh bash script. It's already in your scripts/ subdirectory, but we can just grab it again and run it:

    wget https://pjreddie.com/media/files/imagenet_label.sh
    bash imagenet_label.sh
    

    This will generate two things: a directory called labelled/ which contains renamed symbolic links to the images, and a file called inet.val.list which contains a list of the paths of the labelled images. We need to move this file to the data/ subdirectory in Darknet:

    mv inet.val.list <path-to>/darknet/data
    

    Now you are finally ready to validate your model! First re-make Darknet. Then run the validation routine like so:

    ./darknet classifier valid cfg/imagenet1k.data cfg/darknet19.cfg darknet19.weights
    

    Note: if you don't compile Darknet with OpenCV then you won't be able to load all of the ImageNet images since some of them are weird formats not supported by stb_image.h.

    If you don't compile with CUDA you can still validate on ImageNet but it will take like a reallllllly long time. Not recommended.
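
    If you only want a quick spot check rather than the full 50,000-image run, you can point the data file at a truncated list first. A rough sketch, assuming your cfg/imagenet1k.data has a valid= entry pointing at data/inet.val.list (check your copy before editing):

    # Spot check on the first 1,000 validation images (not the official number)
    head -1000 data/inet.val.list > data/inet.val.small.list
    sed -i 's|^valid *=.*|valid = data/inet.val.small.list|' cfg/imagenet1k.data
    ./darknet classifier valid cfg/imagenet1k.data cfg/darknet19.cfg darknet19.weights
    # Remember to restore valid= to data/inet.val.list for the full run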

    Pre-Trained Models

    Here are a variety of pre-trained models for ImageNet classification. Accuracy is measured as single-crop validation accuracy on ImageNet. GPU timing is measured on a Titan X, CPU timing on an Intel i7-4790K (4 GHz) run on a single core. Using multi-threading with OPENMP should scale linearly with # of CPUs.

    Model                 Top-1  Top-5  Ops       GPU      CPU     Cfg  Weights
    AlexNet               57.0   80.3    2.27 Bn   3.1 ms  0.29 s  cfg  238 MB
    Darknet Reference     61.1   83.0    0.96 Bn   2.9 ms  0.14 s  cfg   28 MB
    VGG-16                70.5   90.0   30.94 Bn   9.4 ms  4.36 s  cfg  528 MB
    Extraction            72.5   90.8    8.52 Bn   4.8 ms  0.97 s  cfg   90 MB
    Darknet19             72.9   91.2    7.29 Bn   6.2 ms  0.87 s  cfg   80 MB
    Darknet19 448x448     76.4   93.5   22.33 Bn  11.0 ms  2.96 s  cfg   80 MB
    Resnet 18             70.7   89.9    4.69 Bn   4.6 ms  0.57 s  cfg   44 MB
    Resnet 34             72.4   91.1    9.52 Bn   7.1 ms  1.11 s  cfg   83 MB
    Resnet 50             75.8   92.9    9.74 Bn  11.4 ms  1.13 s  cfg   87 MB
    Resnet 101            77.1   93.7   19.70 Bn  20.0 ms  2.23 s  cfg  160 MB
    Resnet 152            77.6   93.8   29.39 Bn  28.6 ms  3.31 s  cfg  220 MB
    ResNeXt 50            77.8   94.2   10.11 Bn  24.2 ms  1.20 s  cfg  220 MB
    ResNeXt 101 (32x4d)   77.7   94.1   18.92 Bn  58.7 ms  2.24 s  cfg  159 MB
    ResNeXt 152 (32x4d)   77.6   94.1   28.20 Bn  73.8 ms  3.31 s  cfg  217 MB
    Densenet 201          77.0   93.7   10.85 Bn  32.6 ms  1.38 s  cfg   66 MB
    Darknet53             77.2   93.8   18.57 Bn  13.7 ms  2.11 s  cfg  159 MB
    Darknet53 448x448     78.5   94.7   56.87 Bn  26.3 ms  7.21 s  cfg  159 MB
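
    Any model in this table runs with the same classifier predict command; only the cfg and weights change. A sketch for Darknet53, assuming its weights file follows the same naming pattern as darknet19.weights (the Weights links above give the exact URLs):

    # Assumed file names; adjust if the Weights link for your model differs
    wget https://pjreddie.com/media/files/darknet53.weights
    ./darknet classifier predict cfg/imagenet1k.data cfg/darknet53.cfg darknet53.weights data/dog.jpg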

    AlexNet

    The model that started a revolution! The original model was crazy with the split GPU thing, so this is the model from some follow-up work.

    Darknet Reference Model

    This model is designed to be small but powerful. It attains the same top-1 and top-5 performance as AlexNet but with 1/10th the parameters. It uses mostly convolutional layers without the large fully connected layers at the end. It is about twice as fast as AlexNet on CPU, making it more suitable for some vision applications.

    VGG-16

    The Visual Geometry Group at Oxford developed the VGG-16 model for the ILSVRC-2014 competition. It is highly accurate and widely used for classification and detection. I adapted this version from the Caffe pre-trained model. It was trained for an additional 6 epochs to adjust to Darknet-specific image preprocessing (instead of mean subtraction Darknet adjusts images to fall between -1 and 1).

    Extraction

    I developed this model as an offshoot of the GoogleNet model. It doesn't use the "inception" modules, only 1x1 and 3x3 convolutional layers.

    Darknet19

    I modified the Extraction network to be faster and more accurate. This network was sort of a merging of ideas from the Darknet Reference network and Extraction as well as numerous publications like Network In Network, Inception, and Batch Normalization.

    Darknet19 448x448

    I trained Darknet19 for 10 more epochs with a larger input image size, 448x448. This model performs significantly better but is slower since it processes a larger input image.

    Resnet 50

    For some reason people love these networks even though they are so sloooooow. Whatever. Paper

    Resnet 152

    For some reason people love these networks even though they are so sloooooow. Whatever. Paper

    Densenet 201

    I love DenseNets! They are just so deep and so crazy and work so well. Like Resnet, still slow since there are sooooo many layers, but at least they work really well! Paper

