SHPED contains annotated poses in stereo videos
We provide a dataset of stereo image pairs suited for upper-body human pose estimation in stereo. SHPED consists of 630 stereo image pairs (i.e. 1260 images) organized into 42 video clips of 15 frames each. The clips have been extracted from 26 stereo videos, obtained from YouTube with the tag yt3d:enable = true. In addition, SHPED contains 1470 stickman upper-body annotations corresponding to 49 persons who satisfy the following conditions: upright position, almost all upper-body parts visible, and non-profile viewpoint of the body. Furthermore, for every clip we include a plane projective transformation for rectifying the image pairs, as well as detections (bounding boxes) of each person along the sequence. The stereo image pairs cover a wide range of variation in appearance, clothing, human pose, illumination, image quality, baseline separation of the cameras, and background.
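As a quick sanity check, the counts quoted above can be derived from the per-clip figures, and the per-clip plane projective transformation can be applied to a pixel coordinate. The sketch below is illustrative only. The function names `dataset_counts` and `apply_homography` are hypothetical, the transformation is assumed to be stored as a 3x3 matrix, and the reading that each of the 49 persons is annotated in all 15 frames of both views is an assumption that happens to reproduce the stated total of 1470. The actual file layout of SHPED is not described here.

```python
# Figures stated in the dataset description.
NUM_CLIPS = 42
FRAMES_PER_CLIP = 15
NUM_PERSONS = 49
VIEWS_PER_PAIR = 2  # left and right image of a stereo pair


def dataset_counts():
    """Derive (stereo pairs, images, annotations) from the per-clip figures."""
    stereo_pairs = NUM_CLIPS * FRAMES_PER_CLIP
    images = stereo_pairs * VIEWS_PER_PAIR
    # Assumption: every person is annotated in each frame of both views.
    annotations = NUM_PERSONS * FRAMES_PER_CLIP * VIEWS_PER_PAIR
    return stereo_pairs, images, annotations


def apply_homography(H, point):
    """Map a pixel (x, y) through a 3x3 plane projective transformation H.

    H is assumed to be a nested list (3 rows of 3 numbers); this storage
    format is hypothetical and may differ from the dataset's own files.
    """
    x, y = point
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if w == 0:
        raise ValueError("point maps to infinity under H")
    # Divide by the homogeneous coordinate to return to pixel coordinates.
    return (u / w, v / w)
```

For example, `dataset_counts()` reproduces the 630 pairs, 1260 images and 1470 annotations quoted above, and a pure translation matrix `[[1, 0, 5], [0, 1, -3], [0, 0, 1]]` shifts the pixel `(10, 20)` to `(15.0, 17.0)`.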