How to create a caffemodel file from training images and their labels?

To get a caffemodel you need to train the network. That prototxt file is only to deploy the model and cannot be used to train it.

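You need to add a data layer that points to your data. To use a list of image files and labels like the one you describe, the layer should be an image data layer (IMAGE_DATA in the old 'layers' syntax, "ImageData" in the current one) whose source is that text file; an HDF5 data layer instead expects HDF5 files. You will probably want to add a transform_param with the mean value. For efficiency, the image files can be replaced by an LMDB or LevelDB database.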
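As a rough sketch of the list-file route, the snippet below writes a train.txt with one "filename label" entry per line, which is the format an image-list data layer reads. It shifts the 1-based age groups from the question to 0-based labels, since Caffe's softmax loss expects labels that start at 0. The grouping rule (10 consecutive images per group), the function names, and the output location are assumptions for illustration only.

```python
import os
import tempfile


def build_list_lines(num_images=100, images_per_group=10):
    """Return 'filename label' lines with 0-based labels.

    Assumes (hypothetically) that image1..image10 belong to group 1,
    image11..image20 to group 2, and so on, as in the question.
    """
    lines = []
    for i in range(1, num_images + 1):
        group = (i - 1) // images_per_group + 1  # 1-based age group
        label = group - 1                        # shift to 0-based label
        lines.append("image%d.png %d" % (i, label))
    return lines


def write_list_file(path, lines):
    """Write one 'filename label' entry per line."""
    with open(path, "w") as f:
        f.write("\n".join(lines) + "\n")


lines = build_list_lines()
# Hypothetical output location; point the data layer's source at this file.
list_path = os.path.join(tempfile.gettempdir(), "train.txt")
write_list_file(list_path, lines)
print(lines[0], "|", lines[-1])
```

Note that with 10 groups the final fc8 layer would need num_output: 10 rather than the 8 in the posted prototxt.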

At the end of the network you will have to substitute the 'prob' layer with a 'loss' layer. Something like this:

layers { name: "loss" type: SOFTMAX_LOSS bottom: "fc8" bottom: "label" top: "loss" }
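To make the swap concrete, here is a small sketch in plain Python of what a softmax loss computes for a single sample: the negative log of the softmax probability assigned to the true label. It illustrates the formula only and is not Caffe's implementation; the function name and the example scores are made up.

```python
import math


def softmax_loss(scores, label):
    """Negative log-likelihood of `label` under softmax(scores)."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    prob = exps[label] / sum(exps)
    return -math.log(prob)


# Hypothetical fc8 outputs for 8 classes; the true class is index 2.
fc8 = [0.1, 0.3, 2.0, -1.0, 0.0, 0.5, -0.2, 0.4]
loss = softmax_loss(fc8, 2)
print(round(loss, 4))
```

This is why a loss layer takes both the scores ("fc8") and the labels as inputs, whereas the deploy-time 'prob' layer takes only the scores.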

The layer catalogue can be found here:

http://caffe.berkeleyvision.org/tutorial/layers.html

Or, as your network is a well known one... just look at this tutorial :P.

http://caffe.berkeleyvision.org/gathered/examples/imagenet.html

The correct prototxt file for training is included in caffe ('train_val.prototxt').
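Training also needs a solver file that points at the training prototxt; the .caffemodel is then written out as a snapshot. The sketch below generates a minimal solver.prototxt from Python. The field names are standard Caffe solver parameters, but every value, the "age_net" prefix, and the file paths are placeholder assumptions to tune for your own data.

```python
import os
import tempfile

# Placeholder hyperparameters: assumptions for illustration only.
solver_fields = [
    ("net", '"train_val.prototxt"'),
    ("base_lr", "0.001"),
    ("lr_policy", '"step"'),
    ("gamma", "0.1"),
    ("stepsize", "10000"),
    ("momentum", "0.9"),
    ("weight_decay", "0.0005"),
    ("display", "20"),
    ("max_iter", "50000"),
    ("snapshot", "1000"),
    ("snapshot_prefix", '"age_net"'),
    ("solver_mode", "CPU"),
]


def render_solver(fields):
    """Render (name, value) pairs as 'name: value' prototxt lines."""
    return "".join("%s: %s\n" % (name, value) for name, value in fields)


solver_text = render_solver(solver_fields)
# Hypothetical output location for the generated solver file.
solver_path = os.path.join(tempfile.gettempdir(), "solver.prototxt")
with open(solver_path, "w") as f:
    f.write(solver_text)
print(solver_text)
```

Running "caffe train --solver=solver.prototxt" with a file like this would write snapshots named like age_net_iter_1000.caffemodel. If the training prototxt defines a TEST phase, test_iter and test_interval would also need to be added to the solver.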

Author: Jame

Updated on May 30, 2022

Comments

  • Jame (almost 2 years ago)

    I am working on age classification based on the open-source code here. The Python code has:

    import caffe

    age_net_pretrained = './age_net.caffemodel'
    age_net_model_file = './deploy_age.prototxt'
    age_net = caffe.Classifier(age_net_model_file, age_net_pretrained,
                               channel_swap=(2, 1, 0),
                               raw_scale=255,
                               image_dims=(256, 256))
    

    The .prototxt file is shown below. The one remaining file is the ".caffemodel", which the author of the source code provides. However, I would like to create it again from my own face database. Is there a tutorial or some other way to create it? Assume I have a folder of 100 images, each assigned to one age group (1 to 10), such as:

    image1.png 1
    image2.png 1
    ..
    image10.png 1
    image11.png 2
    image12.png 2
    ...
    image100.png 10
    

    This is the prototxt file. Thanks in advance.

    name: "CaffeNet"
    input: "data"
    input_dim: 1
    input_dim: 3
    input_dim: 227
    input_dim: 227
    layers {
      name: "conv1"
      type: CONVOLUTION
      bottom: "data"
      top: "conv1"
      convolution_param {
        num_output: 96
        kernel_size: 7
        stride: 4
      }
    }
    layers {
      name: "relu1"
      type: RELU
      bottom: "conv1"
      top: "conv1"
    }
    layers {
      name: "pool1"
      type: POOLING
      bottom: "conv1"
      top: "pool1"
      pooling_param {
        pool: MAX
        kernel_size: 3
        stride: 2
      }
    }
    layers {
      name: "norm1"
      type: LRN
      bottom: "pool1"
      top: "norm1"
      lrn_param {
        local_size: 5
        alpha: 0.0001
        beta: 0.75
      }
    }
    layers {
      name: "conv2"
      type: CONVOLUTION
      bottom: "norm1"
      top: "conv2"
      convolution_param {
        num_output: 256
        pad: 2
        kernel_size: 5
      }
    }
    layers {
      name: "relu2"
      type: RELU
      bottom: "conv2"
      top: "conv2"
    }
    layers {
      name: "pool2"
      type: POOLING
      bottom: "conv2"
      top: "pool2"
      pooling_param {
        pool: MAX
        kernel_size: 3
        stride: 2
      }
    }
    layers {
      name: "norm2"
      type: LRN
      bottom: "pool2"
      top: "norm2"
      lrn_param {
        local_size: 5
        alpha: 0.0001
        beta: 0.75
      }
    }
    layers {
      name: "conv3"
      type: CONVOLUTION
      bottom: "norm2"
      top: "conv3"
      convolution_param {
        num_output: 384
        pad: 1
        kernel_size: 3
      }
    }
    layers {
      name: "relu3"
      type: RELU
      bottom: "conv3"
      top: "conv3"
    }
    layers {
      name: "pool5"
      type: POOLING
      bottom: "conv3"
      top: "pool5"
      pooling_param {
        pool: MAX
        kernel_size: 3
        stride: 2
      }
    }
    layers {
      name: "fc6"
      type: INNER_PRODUCT
      bottom: "pool5"
      top: "fc6"
      inner_product_param {
        num_output: 512
      }
    }
    layers {
      name: "relu6"
      type: RELU
      bottom: "fc6"
      top: "fc6"
    }
    layers {
      name: "drop6"
      type: DROPOUT
      bottom: "fc6"
      top: "fc6"
      dropout_param {
        dropout_ratio: 0.5
      }
    }
    layers {
      name: "fc7"
      type: INNER_PRODUCT
      bottom: "fc6"
      top: "fc7"
      inner_product_param {
        num_output: 512
      }
    }
    layers {
      name: "relu7"
      type: RELU
      bottom: "fc7"
      top: "fc7"
    }
    layers {
      name: "drop7"
      type: DROPOUT
      bottom: "fc7"
      top: "fc7"
      dropout_param {
        dropout_ratio: 0.5
      }
    }
    layers {
      name: "fc8"
      type: INNER_PRODUCT
      bottom: "fc7"
      top: "fc8"
      inner_product_param {
        num_output: 8
      }
    }
    layers {
      name: "prob"
      type: SOFTMAX
      bottom: "fc8"
      top: "prob"
    }
    
  • omatai (over 6 years ago)
    You say that this prototxt file is only to deploy the model. Is that because the first input_dim specified is 1, and to train you need to specify a reasonable batch size?
  • MacKa (about 6 years ago)
    Hi, I'm new to this area. I would like to create my own .caffemodel and convert it to CoreML. How would I do that? Please guide me.