How to create caffe.deploy from train.prototxt


There are two main differences between a "train" prototxt and a "deploy" one:

1. Inputs: While for training the data is fixed to a pre-processed training dataset (lmdb/HDF5 etc.), deploying the net requires it to process other inputs in a more "random" fashion.
Therefore, the first change is to remove the input layers (layers that push "data" and "labels" during TRAIN and TEST phases). To replace the input layers you need to add the following declaration:

input: "data"
input_shape: { dim:1 dim:3 dim:224 dim:224 }

This declaration does not provide the actual data for the net, but it tells the net what shape to expect, allowing Caffe to pre-allocate the necessary resources.
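As a minimal sketch, the same declaration can be generated for any input shape with a few lines of Python. The helper name `input_declaration` is hypothetical and not part of Caffe; it only builds the text shown above.

```python
def input_declaration(name, shape):
    """Build the `input` / `input_shape` lines for a deploy prototxt.

    `shape` is (batch, channels, height, width).
    """
    dims = " ".join("dim:{}".format(d) for d in shape)
    return 'input: "{}"\ninput_shape: {{ {} }}'.format(name, dims)


# One 3-channel 224x224 image at a time, as in the declaration above.
print(input_declaration("data", (1, 3, 224, 224)))
```

Changing the first element of `shape` changes the batch size the net expects at deploy time.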

2. Loss: The topmost layers in a training prototxt define the loss function for training. This usually involves the ground truth labels. When deploying the net, you no longer have access to these labels. Thus loss layers should be converted to "prediction" outputs. For example, a "SoftmaxWithLoss" layer should be converted to a simple "Softmax" layer that outputs class probabilities instead of a log-likelihood loss. Some other loss layers already take predictions as inputs, so it is sufficient to simply remove them.
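The two changes can be sketched as a plain-text transformation in Python. This is only an illustration under several assumptions, not a general converter: layers use the `layer { ... }` syntax and start at the beginning of a line, braces never appear inside quoted strings, the layer's own `type:` is the first `type:` in its block, the label blob is called "label", and the loss is a "SoftmaxWithLoss". The function names, the `INPUT_LAYER_TYPES` set and the output blob name "prob" are hypothetical choices.

```python
import re

# Assumption: layer types that push data/labels in the train file.
INPUT_LAYER_TYPES = {"Data", "HDF5Data", "ImageData", "MemoryData"}


def layer_spans(text):
    """Yield (start, end) offsets of each top-level `layer { ... }` block."""
    for match in re.finditer(r"^layer\s*\{", text, flags=re.MULTILINE):
        depth = 0
        # Walk forward from the opening brace until it is balanced.
        for i in range(match.end() - 1, len(text)):
            if text[i] == "{":
                depth += 1
            elif text[i] == "}":
                depth -= 1
                if depth == 0:
                    yield match.start(), i + 1
                    break


def train_to_deploy(train_text, input_name="data", shape=(1, 3, 224, 224)):
    """Return deploy prototxt text derived from train prototxt text."""
    pieces, pos = [], 0
    for start, end in layer_spans(train_text):
        pieces.append(train_text[pos:start])
        block = train_text[start:end]
        layer_type = re.search(r'type:\s*"(\w+)"', block).group(1)
        if layer_type in INPUT_LAYER_TYPES:
            block = ""  # change 1: drop layers that push data/labels
        elif layer_type == "SoftmaxWithLoss":
            # change 2: turn the loss into a prediction output
            block = block.replace('"SoftmaxWithLoss"', '"Softmax"')
            block = re.sub(r'\n[ \t]*bottom:\s*"label"', "", block)
            block = re.sub(r'top:\s*"\w+"', 'top: "prob"', block)
        pieces.append(block)
        pos = end
    pieces.append(train_text[pos:])
    dims = " ".join("dim:{}".format(d) for d in shape)
    header = 'input: "{}"\ninput_shape: {{ {} }}\n'.format(input_name, dims)
    return header + "".join(pieces)


TRAIN = '''name: "tiny"
layer {
  name: "data"
  type: "Data"
  top: "data"
  top: "label"
  data_param { source: "train_lmdb" batch_size: 64 backend: LMDB }
}
layer {
  name: "fc"
  type: "InnerProduct"
  bottom: "data"
  top: "fc"
  inner_product_param { num_output: 10 }
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "fc"
  bottom: "label"
  top: "loss"
}
'''

deploy_text = train_to_deploy(TRAIN)
print(deploy_text)
```

Note that `bottom: "data"` on the first real layer is left in place: it now refers to the declared input rather than to the removed data layer.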

Update: see this tutorial for more information.

Author by

Carlos Porta

Updated on July 13, 2022

Comments

  • Carlos Porta
    Carlos Porta almost 2 years

    This is my train.prototxt. And this is my deploy.prototxt.

    When I want to load my deploy file I get this error:

    File "./python/caffe/classifier.py", line 29, in __init__  
    in_ = self.inputs[0]  
    IndexError: list index out of range  
    

    So I removed the data layer and got this error:

    F1117 23:16:09.485153 21910 insert_splits.cpp:35] Unknown bottom blob 'data' (layer 'conv1', bottom index 0)
    *** Check failure stack trace: ***
    

    Then I removed bottom: "data" from the conv1 layer.

    After that, I got this error:

    F1117 23:17:15.363919 21935 insert_splits.cpp:35] Unknown bottom blob 'label' (layer 'loss', bottom index 1)
    *** Check failure stack trace: ***
    

    I removed bottom: "label" from the loss layer, and got this error:

    I1117 23:19:11.171021 21962 layer_factory.hpp:76] Creating layer conv1
    I1117 23:19:11.171036 21962 net.cpp:110] Creating Layer conv1
    I1117 23:19:11.171041 21962 net.cpp:433] conv1 -> conv1
    F1117 23:19:11.171061 21962 layer.hpp:379] Check failed: MinBottomBlobs() <= bottom.size() (1 vs. 0) Convolution Layer takes at least 1 bottom blob(s) as input.
    *** Check failure stack trace: ***
    

    What should I do to fix it and create my deploy file?

  • Shai
    Shai about 8 years
    @0x1337 To define the shape of the input "data" we use the 'BlobShape' proto message. This shape has a "repeated" parameter, dim, each occurrence of which defines one dimension of the shape. The first dim:1 means we expect 'data', at the deploy stage, to include only one sample at a time (i.e., batch_size:1).
  • Shai
    Shai almost 8 years
    It depends on the way your dropout layer is implemented. In some cases you need to replace the dropout with a scaling to compensate for the increased energy of the signal that is no longer dropped. For instance, if you drop 50% during training, then at test time twice as much signal passes through the layer, so you need to scale the output down by 50%.
  • cdeepakroy
    cdeepakroy over 7 years
    @Shai Thanks for the clear explanation. Is there a way to programmatically generate the deploy.prototxt in Python? More details about my question are here -- stackoverflow.com/questions/40986009/…
  • Farid Alijani
    Farid Alijani about 4 years
    @Shai, what difference does it make to use a larger batch_size during testing? Does it have any effect on test accuracy, or does it only matter for the training process?
  • Shai
    Shai about 4 years
    @FäridAlijani it has no effect on test time accuracy
  • Farid Alijani
    Farid Alijani about 4 years
    @Shai, so I have

    net.blobs[net.inputs[0]].reshape(batch_sz, ch, height, width)  # n,C,H,W -> 1,C,H,W

    do I always assume batch_size = 1 is the easiest w.r.t. memory consumption and time efficiency?
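
To illustrate the dropout comment above, here is a small simulation in plain Python. It assumes a dropout that only zeroes values during training and does not rescale them; implementations that already rescale during training need no change at deploy time. The function names are hypothetical and this is not Caffe code.

```python
import random


def dropout_train(values, drop_ratio, rng):
    """Training-time dropout without rescaling: zero each value with
    probability drop_ratio."""
    return [0.0 if rng.random() < drop_ratio else v for v in values]


def dropout_deploy(values, drop_ratio):
    """Deploy-time replacement: keep every value, scale by the keep ratio."""
    return [v * (1.0 - drop_ratio) for v in values]


rng = random.Random(0)  # fixed seed so the run is repeatable
values = [1.0] * 100000
drop_ratio = 0.5

train_mean = sum(dropout_train(values, drop_ratio, rng)) / len(values)
deploy_mean = sum(dropout_deploy(values, drop_ratio)) / len(values)

# The scaled deploy output matches the average signal seen in training.
print(train_mean, deploy_mean)
```

Without the scaling, the deploy-time mean would be 1.0, twice the average signal the following layers saw during training.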