Skimming SphereReID: Deep Hypersphere Manifold Embedding for Person Re-identification [1]. This paper proposes a modified softmax, named Sphere Loss, which trains the network to learn an embedding on a hypersphere manifold and thereby improves its discriminative ability.
Pain points
In person re-identification, Softmax Loss and Triplet Loss are the most commonly used objectives, as shown in the figure below:
The problem with Softmax Loss (ID Loss) is that it places no constraint on the distribution of the feature space, so the learned feature mapping may be suboptimal. Triplet Loss, on the other hand, squeezes each feature dimension into a very small interval, so the target embedding space may not be fully utilized.
After introducing feature normalization and weight normalization, classification depends only on the angle between the embedding vector and the weight vector of the target class; eliminating the differing norms gives a clear geometric interpretation of the embedding space. In addition, classification supervised by plain Softmax suffers from per-class sample-size imbalance, which ultimately degrades performance. As shown in Figure 1(c), the embedding vectors are distributed on a hypersphere manifold. Unlike Euclidean-space embeddings, SphereReID's defining feature is that images are mapped onto the surface of a hypersphere, which restricts the possible distribution to a bounded angular space. The target embedding space can therefore be fully utilized when training the network to classify images of different pedestrians.
Model
Sphere Loss constrains the sample embeddings to lie on a hypersphere manifold, as shown in the diagram below:
Here the green arrows W1 and W2 represent the center weight vectors of two different classes, and the yellow arrows represent the embedded feature vectors. Under Softmax the embeddings are unevenly distributed, while Sphere Loss constrains them to the circle (hypersphere).
Recall that the softmax loss formula is as follows:
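The equation itself is missing from this note; reconstructed from the standard definition (N samples, C classes, feature x_i with label y_i, FC weights W_j and biases b_j):

```latex
L_{softmax} = -\frac{1}{N}\sum_{i=1}^{N}
\log \frac{e^{W_{y_i}^{T} x_i + b_{y_i}}}{\sum_{j=1}^{C} e^{W_j^{T} x_i + b_j}}
```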
Each logit is produced by multiplying the feature with the weight vector of the corresponding neuron in the last FC layer:
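The referenced illustration is missing; the computation it depicts is presumably the per-class logit:

```latex
z_j = W_j^{T} x + b_j
```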
Softmax's decision boundary is as follows:
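The boundary equation is missing here; for two classes it follows directly from equating the two logits:

```latex
(W_1 - W_2)^{T} x + (b_1 - b_2) = 0
```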
In contrast, Sphere Loss L2-normalizes W and x as follows:
Sphere Loss performs L2 normalization on W and x, eliminating the influence of the norms so that only the angle carries discriminative information.
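A minimal NumPy sketch of this normalization step (function name and shapes are mine, not from the paper): after L2-normalizing the feature and each class weight vector, the logit for class j is exactly cos(theta_j), and rescaling either vector changes nothing.

```python
import numpy as np

def cosine_logits(x, W):
    """L2-normalize feature x (d,) and class weights W (d, C), then take
    inner products. After normalization the logit for class j equals
    cos(theta_j), the cosine of the angle between x and column W[:, j]."""
    x_hat = x / np.linalg.norm(x)                         # ||x_hat|| = 1
    W_hat = W / np.linalg.norm(W, axis=0, keepdims=True)  # unit-norm columns
    return W_hat.T @ x_hat                                # cos(theta_j) per class

# Scaling x (or any weight column) leaves the logits unchanged:
x = np.array([3.0, 4.0])
W = np.array([[1.0, 0.0],
              [0.0, 2.0]])            # two classes, columns are weight vectors
print(cosine_logits(x, W))           # [0.6 0.8]
print(cosine_logits(10 * x, W))      # identical: [0.6 0.8]
```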
The decision boundary diagram of Softmax Loss and Sphere Loss is as follows:
The decision boundary of Sphere Loss:
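The equation is missing from this note; since the normalized logits are pure cosines, the two-class boundary reduces to equal angles, i.e. the angular bisector of W_1 and W_2:

```latex
\cos\theta_1 = \cos\theta_2 \;\Longleftrightarrow\; \theta_1 = \theta_2
```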
The formula of Sphere Loss is as follows:
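Reconstructed from the paper's definition (cross-entropy over scaled cosines, matching the softmax formula above with the norms removed):

```latex
L_{sphere} = -\frac{1}{N}\sum_{i=1}^{N}
\log \frac{e^{\,s\cos\theta_{y_i}}}{\sum_{j=1}^{C} e^{\,s\cos\theta_j}},
\qquad
\cos\theta_j = \frac{W_j^{T} x_i}{\lVert W_j \rVert \, \lVert x_i \rVert}
```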
where s is a temperature (scale) constant, set to 14 in the experiments.
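A self-contained NumPy sketch of the loss (my implementation of the formula above, not the authors' code; in practice this would be a PyTorch module with `F.normalize` and a bias-free FC layer):

```python
import numpy as np

def sphere_loss(X, W, labels, s=14.0):
    """Sphere Loss sketch: cross-entropy over s * cos(theta_j).

    X: (N, d) batch of features, W: (d, C) class weight vectors,
    labels: (N,) integer class ids, s: temperature (14 in the paper).
    """
    X_hat = X / np.linalg.norm(X, axis=1, keepdims=True)   # unit-norm features
    W_hat = W / np.linalg.norm(W, axis=0, keepdims=True)   # unit-norm weights
    logits = s * (X_hat @ W_hat)                           # (N, C), s * cos(theta)
    # numerically stable log-softmax
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()
```

When a feature points along its class weight vector the scaled cosine is s and the loss is near zero; pointing along the wrong class weight pushes the loss toward s.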
Experiments
The backbone is ResNet-50. The ablation study and the comparison against Softmax Loss are shown below; network D performs best, and Sphere Loss is generally better than Softmax Loss:
Network A ends with global average pooling; network B is global average pooling + FC layer; network C is global average pooling + FC layer + BN; network D is global average pooling + BN + Dropout + FC layer + BN.
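Network D's head can be sketched in PyTorch as follows (class name, channel count, and the 1024-d embedding size are my assumptions, not values confirmed by this note):

```python
import torch
import torch.nn as nn

class NetworkDHead(nn.Module):
    """Sketch of network D's head: global average pooling -> BN -> Dropout
    -> FC -> BN. The final BN output is the embedding fed to Sphere Loss.
    in_channels=2048 matches ResNet-50; embed_dim=1024 is an assumption."""

    def __init__(self, in_channels=2048, embed_dim=1024, p_drop=0.5):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)      # global average pooling
        self.bn1 = nn.BatchNorm1d(in_channels)
        self.drop = nn.Dropout(p_drop)
        self.fc = nn.Linear(in_channels, embed_dim)
        self.bn2 = nn.BatchNorm1d(embed_dim)

    def forward(self, feat_map):                 # (N, C, H, W) from the backbone
        x = self.pool(feat_map).flatten(1)       # (N, C)
        return self.bn2(self.fc(self.drop(self.bn1(x))))
```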
The ablation on the dropout ratio is as follows:
SOTA comparison on Market-1501:
SOTA comparison on CUHK-SYSU:
SOTA comparison on DukeMTMC-reID:
SOTA comparison on CUHK03:
Reference
[1] Fan X, Jiang W, Luo H, et al. Spherereid: Deep hypersphere manifold embedding for person re-identification[J]. Journal of Visual Communication and Image Representation, 2019, 60: 51-58.