Open Datasets

Browse 500+ open source datasets for your next machine learning project. Our list of free datasets keeps growing, so make sure you visit it frequently.

COVID-19 Chest X-ray Dataset
Lung segmentation of cases of COVID-19 collected in 2020
Polygons
V7
13008
Items
702
Classes
6504
Labels
CIFAR-10
The CIFAR-10 dataset with 1 tag per image
Image Classification
Classification Tags
V7
60000
Items
10
Classes
60000
Labels
Pens Segmentation
Manually and semi-automatically segmented pens for instance segmentation experiments
Polygons
V7
420
Items
1
Classes
420
Labels
Items Picking Dataset
4600 instance segmentation labels of scattered objects for robotic picking
Polygons
V7
4636
Items
11
Classes
568
Labels
Stanford Dogs
The Stanford Dogs dataset with 120 breeds and 1 tag per image
Image Classification
Bounding Boxes
V7
22126
Items
120
Classes
20580
Labels
Panoramic Dental Tooth Segmentation
Segmentations of teeth in OPG scans
Polygons
V7
3231
Items
1
Classes
116
Labels
Breast Duct Cell Segmentation
23 thousand segmented breast duct cells in brightfield microscopy
Instance Segmentation
V7
24825
Items
5
Classes
409
Labels
Bird Species
Segmentation of three bird species across 1000 images
Polygons
V7
3950
Items
3
Classes
1909
Labels
White Blood Cell
The White Blood Cell dataset with bounding boxes
Polygons
V7
781
Items
5
Classes
410
Labels
LAX Aircraft Detection
1600 instance segmentations of airplanes
Polygons
V7
1604
Items
1
Classes
72
Labels
PanNuke
Open Pan-Cancer Histology Dataset
Medical Images
Semantic Segmentation
20000
Items
312
Classes
20000
Labels
FMD
Fluorescence Microscopy Denoising dataset
Image Classification
Classification Tags
60000
Items
10
Classes
60000
Labels
m2caiseg
Semantic Segmentation of Laparoscopic Surgical Images
3D Semantic Segmentation
Semantic Segmentation
307
Items
19
Classes
307
Labels
KaoKore Dataset
Collection of Facial Expressions from Japanese Artwork
Image Classification
Classification Tags
8842
Items
2
Classes
8848
Labels
DeepFake Detection
A large dataset of visual deepfakes from Google
GAN / Image Generation
Image Pairs
MASSVIS
Massachusetts (Massive) Visualization Dataset
Data Visualization
Holopix50k
A Large-Scale In-the-wild Stereo Image Dataset
3D Reconstruction / Photogrammetry
Semantic Segmentation
49368
Items
12
Classes
49368
Labels
COVID-19 image data collection
Open database of COVID-19 cases with chest X-ray or CT images
Medical Images
Semantic Segmentation
900
Items
6
Classes
900
Labels
MSeg
A Composite Dataset for Multi-domain Semantic Segmentation
3D Semantic Segmentation
Semantic Segmentation
220000
Items
316
Classes
80000
Labels
NightOwls
Pedestrian detection at night
Object Detection
Bounding Boxes
CVPR
297000
Items
40
Classes
297000
Labels
PlotQA
A question-answering dataset for reasoning over scientific plots.
Object Detection
Bounding Boxes
IIT Madras
PadChest
A large chest x-ray image dataset with multi-label annotated reports
Medical Images
Semantic Segmentation
BIMCV
160000
Items
297
Classes
160000
Labels
Earth on AWS
Planetary-scale open geospatial data
Remote Sensing
3D Point Cloud
The Boxy Vehicles Dataset
A large dataset of two million annotated vehicles
Object Detection
Bounding Boxes
BOSCH
1900000
Items
11
Classes
200000
Labels
LVIS v0.5
A Dataset for Large Vocabulary Instance Segmentation
Object Detection
Bounding Boxes
Facebook AI Research
164000
Items
1000
Classes
164000
Labels
UOPNOA & UOS2
PNOA and Sentinel-2 datasets for semantic segmentation for crop classification and land use
3D Semantic Segmentation
Semantic Segmentation
ZENODO
3DPeople Dataset
A Large Dataset of Dressed Humans
3D Semantic Segmentation
Keypoint Skeleton
Institut de Robòtica i Informàtica Industrial, CSIC-UPC
280
Items
70
Classes
2000000
Labels
Exclusively Dark (ExDark) Image Dataset
Large collection of low-light images with image class and object annotations
Medical Images
Bounding Boxes
12
Items
12
Classes
7363
Labels
Oxford-IIIT Pets
Oxford-IIIT Pets
Image Classification
Bounding Boxes
University of Oxford
200
Items
27
Classes
200
Labels
Quick, Draw Dataset
50 million drawings from the world
Image Classification
Classification Tags
50000000
Items
50000000
Labels
SVD
Short Video Dataset
Video Classification
Classification Tags
nju
500000
Items
30000
Classes
500000
Labels
IntrA
3D Intracranial Aneurysm Dataset for Deep Learning
Medical Images
Semantic Segmentation
AMED
116
Items
LabelMe
Large dataset for object detection with labeling tool
Image Classification
Classification Tags
MIT, Computer Science and Artificial Intelligence Laboratory.
Ford Autonomous Vehicle Dataset
Multi-agent seasonal datasets collected by Ford fleet
Autonomous Driving
Bounding Boxes
Ford
The Massively Multilingual Image Dataset
Words and their images in 100 languages
Image Classification
Classification Tags
University of Pennsylvania
D²-City
A Large-scale Video Dataset of Diverse Traffic Scenarios in China
Autonomous Driving
Bounding Boxes
10000
Items
12
Classes
10000
Labels
NEXET
A large and diverse road dataset
Autonomous Driving
Bounding Boxes
41190
Items
10
Classes
50000
Labels
Databrary
Video data library for behavior scientists
Video Classification
Bounding Boxes
New York University
ACAV100M
Automatic Curation of Large-Scale Datasets for Audio-Visual Video Representation Learning
Video Classification
Classification Tags
Seoul National University
140000000
Items
5
Classes
140000000
Labels
BUFF
Bodies Under Flowing Fashion
3D Object Pose
Keypoint
Max-Planck-Gesellschaft
ASTYX HIRES2019 DATASET
First automotive dataset containing imaging radar data
Autonomous Driving
Bounding Boxes
CRUISE MUNICH
A Multiclass Weed Species Image Dataset
A Multiclass Weed Species Image Dataset
Image Classification
Classification Tags
Vehicle-Rear
A novel dataset for vehicle re-identification
Image Classification
Classification Tags
Federal University of Technology - Parana, Curitiba, Brazil
GDXray
X-ray images for nondestructive testing
Object Detection
Bounding Boxes
Universidad Catolica de Chile
72
Items
5
Classes
72
Labels
NVGesture
Dynamic gestures of touchless driving control
3D Object Detection
Bounding Boxes
NVIDIA Corporation
1532
Items
25
Classes
1532
Labels
AFHQ
Animal Faces-HQ
Face Recognition
Bounding Boxes
15000
Items
3
Classes
15000
Labels
Volleyball
Volleyball action recognition dataset
Event Detection
Bounding Boxes
4803
Items
17
Classes
4830
Labels
SDSS Galaxies
Large dataset of galaxy images
GAN / Image Generation
Image Pairs
University of Hertfordshire, Hatfield
306006
Items
10
Classes
306006
Labels
VOT2019
Visual Object Tracking benchmark for short-term tracking
Video Object Tracking
Bounding Boxes
Academic and Research Network of Slovenia (ARNES).
AnimalWeb
A Large-Scale Hierarchical Dataset of Annotated Animal Faces
Face Keypoint Estimation
Keypoint
Computer Vision Laboratory, School of Computer Science
22400
Items
350
Classes
22400
Labels

Frequently Asked Questions

Looking for more materials to build trustworthy AI? Discover our resources page, packed with free guides, webinars, and V7 product updates.
Where to find machine learning datasets?

One of the best places to look for quality open source datasets is our own repository. You can use advanced filtering options and the search box to look for very specific datasets.

For example, if you’re only interested in a specific licence, such as public domain datasets, make sure to select the CC-0 option in the licence filter.

You can combine various filtering options to narrow down your search.

If you haven’t found what you’re looking for among the 500+ open datasets we’ve catalogued here, don’t despair—there are other places you may want to visit.

Start with these articles:

Each of them comes with detailed descriptions and links to online datasets for various purposes.

Can I use all public datasets on the V7 platform?

Yes, all sample datasets in our repository can be imported into V7.

What are the benefits of using open datasets?

Open datasets offer a number of benefits for computer vision projects. Firstly, they allow for easier collaboration between researchers. When data is openly available, researchers can more easily share and build upon each other’s work. This helps to accelerate the pace of research and allows for more innovative solutions to be found.

Secondly, open datasets help to ensure that the data used is of high quality. When data is openly available, it is subject to greater scrutiny from the research community. This helps to ensure that any flaws or errors in the data are quickly identified and corrected.

Finally, open datasets allow for replicability of results. When data is openly available, researchers can more easily check and verify each other’s results. This helps to build confidence in the findings of a study and allows for more reliable conclusions to be drawn.

Are all datasets on the list free?

Yes, all the online datasets in our repository are free to use. The only limitations may involve the scope of usage or requirements to attribute the dataset to its source. You can easily see what licence a given open dataset falls under on each dataset’s dedicated page.

What kinds of free datasets can I find here?

Our repository of open image datasets consists of free public datasets for computer vision projects. For your convenience we’ve divided them into several categories, e.g.:

Computer vision task types

Use cases

You can use the advanced filtering options to browse our open image datasets by task, annotation type, use case, or licence. You can also search by keyword or sort the results alphabetically, by popularity, or by the number of images in the dataset.
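
If you keep your own shortlist of datasets, the same kind of filtering and sorting can be sketched in a few lines of Python. This is only an illustration: the `DatasetEntry` class and the `filter_catalog` and `sort_by_items` helpers are hypothetical names made up for this sketch, not part of V7 or its API, and the sample entries are copied from the listing on this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetEntry:
    """Hypothetical record for one catalogue card (not a V7 type)."""
    name: str
    category: str          # first tag on the card (task or use case)
    annotation_type: str   # second tag on the card
    items: int
    classes: int


# A few entries copied from the listing on this page.
CATALOG = [
    DatasetEntry("CIFAR-10", "Image Classification", "Classification Tags", 60000, 10),
    DatasetEntry("Stanford Dogs", "Image Classification", "Bounding Boxes", 22126, 120),
    DatasetEntry("NightOwls", "Object Detection", "Bounding Boxes", 297000, 40),
    DatasetEntry("NEXET", "Autonomous Driving", "Bounding Boxes", 41190, 10),
    DatasetEntry("KaoKore Dataset", "Image Classification", "Classification Tags", 8842, 2),
]


def filter_catalog(entries, category=None, annotation_type=None):
    """Keep entries that match every filter that was actually supplied."""
    return [
        e for e in entries
        if (category is None or e.category == category)
        and (annotation_type is None or e.annotation_type == annotation_type)
    ]


def sort_by_items(entries, descending=True):
    """Order entries by item count, largest first by default."""
    return sorted(entries, key=lambda e: e.items, reverse=descending)


# Combine a filter with a sort, as you would in the repository UI.
matches = sort_by_items(filter_catalog(CATALOG, annotation_type="Bounding Boxes"))
names = [e.name for e in matches]
print(names)
```

Passing both `category` and `annotation_type` narrows the results further, which mirrors combining several filters in the repository.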

What are open datasets?

Open datasets for machine learning are datasets that are freely available to anyone. You can use them in your projects to train and test your machine learning models.
