The BreakingNews Dataset

Multimodal dataset for news article analysis

To foster research on multimodal news article analysis, we propose the BreakingNews dataset, which includes images, captions, geolocation information, and comments. The dataset contains approximately 100,000 news articles from several major newspapers and media agencies, collected between January 1 and December 31, 2014. Every article includes at least one image, and together they cover a wide variety of topics, including sports, politics, arts, healthcare, and local news. Copyright in all text and images remains with the original owners.

Institut de Robòtica i Informàtica Industrial, CSIC-UPC.
Task: Image Captioning
Annotation Types: Bounding Boxes
Items: 100,000
Classes: 3
Labels: 100,000
Last updated on October 31, 2023
Licensed under Research Only