---
title: Siamese Network Tutorial
description: Train and test a siamese network on MNIST data.
category: example
include_in_docs: true
layout: default
priority: 100
---

# Siamese Network Training with Caffe

This example shows how you can use weight sharing and a contrastive loss
function to learn a model using a siamese network in Caffe.

We will assume that you have caffe successfully compiled. If not, please refer
to the [Installation page](../../installation.html). This example builds on the
[MNIST tutorial](mnist.html), so it would be a good idea to read that before
continuing.

*This guide specifies all paths and assumes all commands are executed from the
root caffe directory.*

## Prepare Datasets

You will first need to download and convert the data from the MNIST
website. To do this, simply run the following commands:

    ./data/mnist/get_mnist.sh
    ./examples/siamese/create_mnist_siamese.sh

After running the script there should be two datasets,
`./examples/siamese/mnist_siamese_train_leveldb`, and
`./examples/siamese/mnist_siamese_test_leveldb`.

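If you want to sanity-check what the conversion script wrote, the short sketch
below reads the first entry of the training database. It is an optional aside,
not part of the shipped example, and assumes the `leveldb` Python bindings and
pycaffe's compiled protos are importable:

    import leveldb
    from caffe.proto import caffe_pb2

    # Open the freshly created training database and grab its first record.
    db = leveldb.LevelDB('examples/siamese/mnist_siamese_train_leveldb')
    key, value = next(db.RangeIter())

    datum = caffe_pb2.Datum()
    datum.ParseFromString(bytes(value))

    # Each entry packs a pair of 28x28 images into 2 channels, plus a binary
    # label: 1 if both digits belong to the same class, 0 otherwise.
    print(datum.channels, datum.height, datum.width, datum.label)
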
## The Model

First, we will define the model that we want to train using the siamese network.
We will use the convolutional net defined in
`./examples/siamese/mnist_siamese.prototxt`. This model is almost
exactly the same as the [LeNet model](mnist.html); the only difference is that
we have replaced the top layers that produced probabilities over the 10 digit
classes with a linear "feature" layer that produces a 2-dimensional vector.

    layer {
      name: "feat"
      type: "InnerProduct"
      bottom: "ip2"
      top: "feat"
      param {
        name: "feat_w"
        lr_mult: 1
      }
      param {
        name: "feat_b"
        lr_mult: 2
      }
      inner_product_param {
        num_output: 2
      }
    }

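As a quick sanity check (a sketch, not part of the example, assuming pycaffe is
built), you can instantiate this single-branch model and confirm that the
`feat` blob really is a 2-dimensional descriptor:

    import caffe

    net = caffe.Net('examples/siamese/mnist_siamese.prototxt', caffe.TEST)
    print(net.blobs['feat'].data.shape)   # (N, 2), N being the input batch size
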
## Define the Siamese Network

In this section we will define the siamese network used for training. The
resulting network is defined in
`./examples/siamese/mnist_siamese_train_test.prototxt`.

### Reading in the Pair Data

We start with a data layer that reads from the LevelDB database we created
earlier. Each entry in this database contains the image data for a pair of
images (`pair_data`) and a binary label saying if they belong to the same class
or different classes (`sim`).

    layer {
      name: "pair_data"
      type: "Data"
      top: "pair_data"
      top: "sim"
      include { phase: TRAIN }
      transform_param {
        scale: 0.00390625
      }
      data_param {
        source: "examples/siamese/mnist_siamese_train_leveldb"
        batch_size: 64
      }
    }

In order to pack a pair of images into the same blob in the database we pack one
image per channel. We want to be able to work with these two images separately,
so we add a slice layer after the data layer. This takes the `pair_data` and
slices it along the channel dimension so that we have a single image in `data`
and its paired image in `data_p`.

    layer {
      name: "slice_pair"
      type: "Slice"
      bottom: "pair_data"
      top: "data"
      top: "data_p"
      slice_param {
        slice_dim: 1
        slice_point: 1
      }
    }

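Conceptually, this slice is the same operation as splitting a NumPy array along
its channel axis. A rough sketch of the equivalent (illustrative only, using
dummy data of the shape the data layer produces):

    import numpy as np

    # pair_data has shape (batch, 2, 28, 28): one image of the pair per channel.
    pair_data = np.zeros((64, 2, 28, 28), dtype=np.float32)

    # slice_dim: 1 with slice_point: 1 cuts the channel axis into [0:1] and [1:2].
    data, data_p = np.split(pair_data, [1], axis=1)
    print(data.shape, data_p.shape)   # (64, 1, 28, 28) (64, 1, 28, 28)
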
### Building the First Side of the Siamese Net

Now we can specify the first side of the siamese net. This side operates on
`data` and produces `feat`. Starting from the net in
`./examples/siamese/mnist_siamese.prototxt` we add default weight fillers. Then
we name the parameters of the convolutional and inner product layers. Naming the
parameters allows Caffe to share the parameters between layers on both sides of
the siamese net. In the definition this looks like:

    ...
    param { name: "conv1_w" ...  }
    param { name: "conv1_b" ...  }
    ...
    param { name: "conv2_w" ...  }
    param { name: "conv2_b" ...  }
    ...
    param { name: "ip1_w" ...  }
    param { name: "ip1_b" ...  }
    ...
    param { name: "ip2_w" ...  }
    param { name: "ip2_b" ...  }
    ...

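Because both branches reference the same parameter names, Caffe backs them with
the same underlying blobs. A hedged sketch of how you could verify this from
pycaffe once the full net is instantiated (it assumes pycaffe is built and that
the LevelDBs from the preparation step exist, since the TRAIN-phase data layer
opens them):

    import numpy as np
    import caffe

    net = caffe.Net('examples/siamese/mnist_siamese_train_test.prototxt', caffe.TRAIN)

    # "conv1" and "conv1_p" both declare the params "conv1_w"/"conv1_b",
    # so their weight blobs hold identical values (they are shared).
    w, w_p = net.params['conv1'][0].data, net.params['conv1_p'][0].data
    print(np.array_equal(w, w_p))   # True
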
### Building the Second Side of the Siamese Net

Now we need to create the second path that operates on `data_p` and produces
`feat_p`. This path is exactly the same as the first, so we can just copy and
paste it. Then we change the name of each layer, input, and output by appending
`_p` to differentiate the "paired" layers from the originals.

### Adding the Contrastive Loss Function

To train the network we will optimize a contrastive loss function proposed in:
Raia Hadsell, Sumit Chopra, and Yann LeCun, "Dimensionality Reduction by Learning
an Invariant Mapping". This loss function encourages matching pairs to be close
together in feature space while pushing non-matching pairs apart. This cost
function is implemented with the `ContrastiveLoss` layer:

    layer {
      name: "loss"
      type: "ContrastiveLoss"
      contrastive_loss_param {
        margin: 1.0
      }
      bottom: "feat"
      bottom: "feat_p"
      bottom: "sim"
      top: "loss"
    }

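For reference, the quantity this layer computes for a batch of feature pairs is
the loss from the Hadsell et al. paper: matching pairs are penalized by their
squared distance, non-matching pairs by how far they fall inside the margin. A
small NumPy sketch of that formulation (illustrative, not the layer's actual
implementation):

    import numpy as np

    def contrastive_loss(feat, feat_p, sim, margin=1.0):
        """Contrastive loss of Hadsell, Chopra & LeCun (2006).

        feat, feat_p: (N, 2) feature vectors from the two branches.
        sim:          (N,) binary labels, 1 for matching pairs, 0 otherwise.
        """
        d = np.linalg.norm(feat - feat_p, axis=1)                  # pairwise distances
        match    = sim * d ** 2                                    # pull matches together
        mismatch = (1 - sim) * np.maximum(margin - d, 0) ** 2      # push non-matches apart
        return np.sum(match + mismatch) / (2.0 * len(sim))
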
## Define the Solver

Nothing special needs to be done to the solver besides pointing it at the
correct model file. The solver is defined in
`./examples/siamese/mnist_siamese_solver.prototxt`.

## Training and Testing the Model

Training the model is simple after you have written the network definition
protobuf and solver protobuf files. Simply run
`./examples/siamese/train_mnist_siamese.sh`:

    ./examples/siamese/train_mnist_siamese.sh

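If you would rather drive training from Python than from the shell script, a
rough equivalent (a sketch, assuming pycaffe is built) is to hand the solver
file to `caffe.SGDSolver`:

    import caffe

    caffe.set_mode_cpu()   # or caffe.set_mode_gpu()
    solver = caffe.SGDSolver('examples/siamese/mnist_siamese_solver.prototxt')
    solver.solve()         # runs until max_iter, snapshotting as configured
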
## Plotting the results

First, we can draw the model and siamese networks by running the following
commands that draw the DAGs defined in the .prototxt files:

    ./python/draw_net.py \
        ./examples/siamese/mnist_siamese.prototxt \
        ./examples/siamese/mnist_siamese.png

    ./python/draw_net.py \
        ./examples/siamese/mnist_siamese_train_test.prototxt \
        ./examples/siamese/mnist_siamese_train_test.png

Second, we can load the learned model and plot the features using the IPython
notebook:

    ipython notebook ./examples/siamese/mnist_siamese.ipynb

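If you prefer a plain script to the notebook, the gist of it can be reproduced
as sketched below. This assumes pycaffe and matplotlib are available, and that
training left a snapshot named `mnist_siamese_iter_50000.caffemodel` (the exact
filename depends on the solver's `snapshot_prefix` and `max_iter`; adjust it to
match your run):

    import struct
    import numpy as np
    import matplotlib.pyplot as plt
    import caffe

    # Load the raw MNIST test set (IDX format) downloaded by get_mnist.sh.
    with open('data/mnist/t10k-images-idx3-ubyte', 'rb') as f:
        _, n, rows, cols = struct.unpack('>IIII', f.read(16))
        images = np.frombuffer(f.read(), dtype=np.uint8).reshape(n, 1, rows, cols)
    with open('data/mnist/t10k-labels-idx1-ubyte', 'rb') as f:
        f.read(8)   # skip the IDX header
        labels = np.frombuffer(f.read(), dtype=np.uint8)

    # Load the single-branch model together with the learned weights.
    caffe.set_mode_cpu()
    net = caffe.Net('examples/siamese/mnist_siamese.prototxt',
                    'examples/siamese/mnist_siamese_iter_50000.caffemodel',
                    caffe.TEST)

    # Forward the scaled images and collect the 2-D "feat" descriptors.
    net.blobs['data'].reshape(n, 1, rows, cols)
    net.blobs['data'].data[...] = images * 0.00390625
    feat = net.forward()['feat']

    # Scatter the 2-D features, coloured by digit class.
    plt.scatter(feat[:, 0], feat[:, 1], c=labels, s=2)
    plt.colorbar()
    plt.show()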