TextVQA

A dataset to benchmark visual reasoning based on text in images

TextVQA requires models to read and reason about text in images in order to answer questions about them. Specifically, models must incorporate a new modality, the text present in images, and reason over it to answer TextVQA questions.

Statistics:
28,408 images from OpenImages
45,336 questions
453,360 ground-truth answers
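Each question in TextVQA carries 10 ground-truth answers (453,360 answers across 45,336 questions). Datasets in this family are commonly scored with the soft VQA accuracy metric, where a prediction counts as fully correct if at least 3 annotators gave that answer. The sketch below shows a simplified version of that metric (the official evaluation also averages over annotator subsets and applies extra answer normalization); the function name and normalization are illustrative, not the official implementation.

```python
def vqa_accuracy(prediction: str, gt_answers: list[str]) -> float:
    """Simplified soft VQA accuracy.

    A prediction scores min(matches / 3, 1), where `matches` is the
    number of ground-truth answers (typically 10 per question) that
    agree with the prediction after basic normalization.
    """
    pred = prediction.strip().lower()
    matches = sum(1 for ans in gt_answers if ans.strip().lower() == pred)
    return min(matches / 3.0, 1.0)


# Hypothetical example: 10 annotator answers for one question.
answers = ["stop"] * 7 + ["stop sign"] * 2 + ["sign"]
print(vqa_accuracy("stop", answers))       # 7 matches -> 1.0
print(vqa_accuracy("stop sign", answers))  # 2 matches -> ~0.67
```

With this scoring, partial credit is possible when only one or two annotators agree with the model, which rewards plausible answers for ambiguous scene text.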

Facebook AI Research
Task: Visual Question Answering
Annotation Types: Semantic Segmentation
Items: 28,408
Classes: 2
Labels: 28,408
Last updated: October 31, 2023
License: CC-BY