Dataset for monocular 3D human performance capture
The MonoPerfCap dataset is intended for evaluating monocular performance capture approaches in a variety of scenarios. It consists of 13 sequences (around 40k frames in total), split into the following subsets:
1) 8 video sequences at 30 Hz covering a variety of scenarios, including indoor and outdoor settings, handheld and static cameras, natural and man-made environments, male and female subjects, and both body-tight and loose garments.
2) 3 long video sequences of 7 minutes each, added to further increase the diversity of human motion in the benchmark. They cover 40 additional actions, ranging from daily actions such as walking and jumping to highly challenging ones such as rolling, kicking, and falling. Each action is repeated multiple times by each of 3 subjects, giving 120 video clips in total.
3) 2 sequences from prior work, [Robertini et al. 2016] and [Wu et al. 2013]. These two sequences provide accurate surface reconstructions from multi-view images, which can be used as ground truth for quantitative evaluation.
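The composition described above can be written down as a small manifest, which makes the counts easy to check. This is only an illustrative sketch: the subset names (`general_scenarios`, `action_sequences`, `multiview_ground_truth`) and the helper functions are hypothetical and do not reflect the dataset's actual file layout or any official tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Subset:
    """One subset of the benchmark. The names used below are hypothetical labels."""
    name: str
    num_sequences: int


# Sequence counts taken from the description; the labels are assumptions.
SUBSETS = [
    Subset("general_scenarios", 8),       # varied scenes, cameras, subjects, clothing
    Subset("action_sequences", 3),        # long sequences, one per subject
    Subset("multiview_ground_truth", 2),  # sequences from prior work with reference surfaces
]

NUM_ACTIONS = 40   # additional actions in the action subset
NUM_SUBJECTS = 3   # subjects performing every action


def total_sequences(subsets):
    """Sum the number of sequences over all subsets."""
    return sum(s.num_sequences for s in subsets)


def num_action_clips(num_actions, num_subjects):
    """One clip per (action, subject) pair."""
    return num_actions * num_subjects


if __name__ == "__main__":
    print(total_sequences(SUBSETS))                     # 13
    print(num_action_clips(NUM_ACTIONS, NUM_SUBJECTS))  # 120
```

The two printed values match the 13 sequences and 120 video clips stated in the description.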