MMAct: A Large-Scale Dataset for Cross-Modal Human Action Understanding
MMAct is a new large-scale multimodal dataset for human action understanding, offering the largest number of modalities to date:

- 7 MODALITIES: RGB video, keypoints, acceleration, gyroscope, orientation, Wi-Fi, and pressure
- 1,900+ VIDEOS: untrimmed videos at 1920x1080, 30 FPS
- 36K INSTANCES: average instance length ranges from 3 to 8 seconds
- 37 CLASSES: daily, abnormal, and desk-work actions
- 4 SCENES: free space, occlusion, station entrance, and desk work
- 4 + 1 VIEWS: 4 surveillance views plus 1 egocentric view
- 20 SUBJECTS: 10 female, 10 male
- RANDOMNESS: collected under a semi-naturalistic collection protocol