Multi-pose human body detection
Multi-pose human body detection has many important practical applications. For example, in human gesture estimation, one often needs to detect the position of the human body first, to provide a reference location for other body parts such as the head, hands, and feet. One of the most studied related topics is pedestrian detection. However, pedestrian detection mainly concerns human bodies in upright positions, whereas people are not always upright: they may be bending, sitting, lying, or in other poses. This makes it necessary to detect the human body under arbitrary poses.
For the study of object detection, many public datasets are available to researchers, such as Pascal VOC 2007, VOC 2010, ILSVRC 2010, and ILSVRC 2012. However, these treat the human being as just one of many object categories rather than focusing on it. In fact, no large-scale dataset tailored to the task of human body detection currently exists. Fortunately, many datasets have been built for pose estimation, such as FLIC, LSP, and MPII Human Pose. Although these datasets contain many annotations of the locations of different body parts, they all lack bounding-box annotations of the whole human body, and the bounding boxes needed for model training and performance evaluation cannot be reliably derived from the available body-part annotations. Therefore, these datasets cannot be used directly for human body detection. To this end, we annotate a new dataset for human body detection by selecting over 26K challenging images from LSP and MPII Human Pose and annotating human body bounding boxes on each selected image.
The resulting dataset, named LSP/MPII-MPHB (Multiple Poses Human Body), contains 26,675 images and 29,732 human bodies. Each image contains at least one human body, and some contain multiple people.
Among these images, 2,000 are from LSP and 24,675 are from MPII Human Pose. We compute the ratio of the size of each ground-truth bounding box to the size of the whole image and plot the frequency histogram of these ratios, as shown in Fig. 2. For almost 70% of the ground-truth boxes, this ratio is less than 10%, indicating that detecting human beings in the MPHB dataset is challenging.
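The size-ratio statistic described above could be computed along the following lines. This is a minimal sketch, not the authors' code: it assumes that "size" means pixel area, that boxes are stored as (x_min, y_min, x_max, y_max), and that the histogram uses ten equal-width bins. The function names and the sample values are hypothetical and are not taken from the dataset.

```python
def size_ratio(box, image_size):
    """Area of a bounding box divided by the area of its image.

    Assumes box = (x_min, y_min, x_max, y_max) in pixels and
    image_size = (width, height); this layout is an assumption.
    """
    x_min, y_min, x_max, y_max = box
    width, height = image_size
    box_area = max(0, x_max - x_min) * max(0, y_max - y_min)
    return box_area / (width * height)


def ratio_histogram(ratios, num_bins=10):
    """Count ratios in equal-width bins covering [0, 1]."""
    counts = [0] * num_bins
    for r in ratios:
        # A ratio of exactly 1.0 is placed in the last bin.
        index = min(int(r * num_bins), num_bins - 1)
        counts[index] += 1
    return counts


# Hypothetical (box, image size) pairs for illustration only.
samples = [
    ((10, 20, 110, 220), (640, 480)),
    ((0, 0, 320, 480), (640, 480)),
    ((50, 50, 90, 130), (640, 480)),
]
ratios = [size_ratio(box, size) for box, size in samples]
counts = ratio_histogram(ratios)

# With ten bins, the first bin holds boxes covering less than 10% of the image.
small_fraction = counts[0] / len(ratios)
```

With ten bins, the share of boxes in the first bin corresponds to the "less than 10%" figure quoted above; a different bin count or a different notion of size would change how the histogram reads.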