Massachusetts (Massive) Visualization Dataset
Understanding the cognitive and perceptual processing of a visualization is essential for effective data presentation and communication to the viewer. The MASSVIS Database was constructed to gain deeper insight into the elements of a visualization that affect its memorability, recognition, recall, and comprehension. It is one of the largest real-world visualization databases, scraped from various online publication venues including government reports, infographic blogs, news media websites, and scientific journals. The diversity and distribution of these visualizations represent data visualizations "in the wild". In addition to providing insights about the visual encoding techniques and designs used by the different publication venues, this database is also a resource for cognitive and perceptual experiments.

MASSVIS consists of over 5,000 static visualizations, of which over 2,000 contain visualization type information. Hundreds of these visualizations also have extensive annotations, memorability scores, eye movements, and labels.

We have eye-movement data for a total of 393 visualizations and 33 viewers, with an average of 16 viewers per visualization. Each viewer looked at each visualization for 10 seconds, generating an average of 37 fixation points per viewing. Across all viewers, this comes to about 600 fixation points per visualization (roughly 16 viewers × 37 fixations each). For each fixation we store its (x, y) location on the visualization, the time point at which it occurred during the viewing period, and its duration (in ms). We provide tools for visualizing the fixation sequences, fixation durations, and fixation heatmaps on top of visualizations.
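
To make the stored fixation fields concrete, here is a minimal Python sketch of how one fixation record could be represented and how a duration-weighted fixation heatmap could be computed from a list of them. It is not the MASSVIS tooling itself: the `Fixation` class, its field names, the `fixation_heatmap` function, and the `cell` and `sigma` parameters are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from math import exp
from typing import List


@dataclass
class Fixation:
    """One fixation record (field names are assumptions, not the dataset's schema)."""
    x: float            # horizontal pixel location on the visualization
    y: float            # vertical pixel location on the visualization
    onset_ms: float     # time point within the viewing period
    duration_ms: float  # how long the fixation lasted, in ms


def fixation_heatmap(fixations: List[Fixation], width: int, height: int,
                     cell: int = 20, sigma: float = 30.0) -> List[List[float]]:
    """Duration-weighted Gaussian heatmap on a coarse grid, scaled so the peak is 1."""
    cols = (width + cell - 1) // cell   # number of grid columns (ceiling division)
    rows = (height + cell - 1) // cell  # number of grid rows
    grid = [[0.0] * cols for _ in range(rows)]
    for f in fixations:
        for r in range(rows):
            cy = (r + 0.5) * cell       # y coordinate of the cell center
            for c in range(cols):
                cx = (c + 0.5) * cell   # x coordinate of the cell center
                d2 = (cx - f.x) ** 2 + (cy - f.y) ** 2
                # Longer fixations contribute more; influence decays with distance.
                grid[r][c] += f.duration_ms * exp(-d2 / (2 * sigma ** 2))
    peak = max(max(row) for row in grid)
    if peak > 0:
        grid = [[v / peak for v in row] for row in grid]
    return grid


# Example: two made-up fixations on a 200 x 100 pixel visualization.
fixations = [Fixation(110, 50, 0, 300), Fixation(30, 30, 400, 100)]
heat = fixation_heatmap(fixations, width=200, height=100)
```

The resulting grid can be upsampled and alpha-blended over the visualization image with any plotting library. Weighting by duration is one design choice among several; an unweighted count of fixations per cell is an equally reasonable alternative.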