Open Datasets

Browse 500+ open source datasets for your next machine learning project. Our list of free datasets keeps growing, so make sure you visit it frequently.

Bird Species
Browsable
Bird Species
Segmentation of three bird species across 1000 images
Polygons
V7
3950
Items
3
Classes
1909
Labels
Breast Duct Cell Segmentation
Browsable
Breast Duct Cell Segmentation
23 thousand segmented breast duct cells in brightfield microscopy
Instance Segmentation
V7
24825
Items
5
Classes
409
Labels
CIFAR-10
Browsable
CIFAR-10
The CIFAR-10 dataset with 1 tag per image
Image Classification
Classification Tags
V7
60000
Items
10
Classes
60000
Labels
COVID-19 Chest X-ray Dataset
Browsable
COVID-19 Chest X-ray Dataset
Lung segmentation of cases of COVID-19 collected in 2020
Polygons
V7
13008
Items
702
Classes
6504
Labels
Items Picking Dataset
Browsable
Items Picking Dataset
4600 instance segmentation labels of scattered objects for robotic picking
Polygons
V7
4636
Items
11
Classes
568
Labels
LAX Aircraft Detection
Browsable
LAX Aircraft Detection
1600 instance segmentations of airplanes
Polygons
V7
1604
Items
1
Classes
72
Labels
Panoramic Dental Tooth Segmentation
Browsable
Panoramic Dental Tooth Segmentation
Segmentations of teeth in OPG scans
Polygons
V7
3231
Items
1
Classes
116
Labels
Pens Segmentation
Browsable
Pens Segmentation
Manually and semi-automatically segmented pens for instance segmentation experiments
Polygons
V7
420
Items
1
Classes
420
Labels
Stanford Dogs
Browsable
Stanford Dogs
The Stanford Dogs dataset with 120 breeds and 1 tag per image
Image Classification
Bounding Boxes
V7
22126
Items
120
Classes
20580
Labels
White Blood Cell
Browsable
White Blood Cell
The White Blood Cell dataset with bounding boxes
Polygons
V7
781
Items
5
Classes
410
Labels
1 million fake faces
Browsable
1 million fake faces
1 million fake faces created with an AI
GAN / Image Generation
Keypoint Skeleton
1000000
Items
100
Classes
1000000
Labels
300W
Browsable
300W
300 Faces-In-The-Wild
Face Recognition
Keypoint
600
Items
3
Classes
399
Labels
3D Hand Pose
Browsable
3D Hand Pose
Large-scale Multiview 3D Hand Pose Dataset
Object Detection
Bounding Boxes
ROVIT
Items
Classes
Labels
3D-ZeF
Browsable
3D-ZeF
A 3D Zebrafish Tracking Benchmark Dataset
Video Object Tracking
Bounding Boxes
VISUAL ANALYSIS & PERCEPTION LAB
Items
Classes
Labels
3D60 Dataset
Browsable
3D60 Dataset
3D Vision Indoors Spherical Panoramas
Scene Understanding
Semantic Segmentation
224406
Items
19
Classes
224406
Labels
3DFAW
Browsable
3DFAW
3D facial keypoint dataset
Face Keypoint Estimation
Keypoint
Multimedia and Human Understanding Group
23000
Items
66
Classes
23000
Labels
3DPeople Dataset
Browsable
3DPeople Dataset
A Large Dataset of Dressed Humans
3D Semantic Segmentation
Keypoint Skeleton
Institut de Robòtica i Informàtica Industrial, CSIC-UPC
280
Items
70
Classes
2000000
Labels
3R-Scan
Browsable
3R-Scan
A large scale dataset for 3D object instance re-localization
3D Object Pose
Keypoint
Technical University of Munich
Items
Classes
Labels
6DoF WIDER Face Annotations - img2pose
Browsable
6DoF WIDER Face Annotations - img2pose
Six degrees of freedom annotations for WIDER FACE
Face Alignment
Classification Tags
Multimedia Laboratory, Department of Information Engineering, The Chinese University of Hong Kong
400000
Items
Classes
400000
Labels
A Multiclass Weed Species Image Dataset
Browsable
A Multiclass Weed Species Image Dataset
A Multiclass Weed Species Image Dataset
Image Classification
Classification Tags
Items
Classes
Labels
A*3D
Browsable
A*3D
An Autonomous Driving Dataset in Challenging Environments
Autonomous Driving
3D Point Cloud
230000
Items
7
Classes
39179
Labels
A2D2
Browsable
A2D2
Audi Autonomous Driving Dataset
Autonomous Driving
Bounding Boxes
Audi AG
400000
Items
30
Classes
400000
Labels
ABC Dataset
Browsable
ABC Dataset
A Big CAD Model Dataset For Geometric Deep Learning
3D Reconstruction / Photogrammetry
Semantic Segmentation
Items
Classes
Labels
ACAV100M
Browsable
ACAV100M
Automatic Curation of Large-Scale Datasets for Audio-Visual Video Representation Learning
Video Classification
Classification Tags
Seoul National University
140000000
Items
5
Classes
140000000
Labels
AFHQ
Browsable
AFHQ
Animal Faces-HQ
Face Recognition
Bounding Boxes
15000
Items
3
Classes
15000
Labels
AFLW2000-3D
Browsable
AFLW2000-3D
AFLW2000-3D
Face Keypoint Estimation
Semantic Segmentation
Institute of Automation, Chinese Academy of Sciences
2000
Items
3
Classes
2000
Labels
AIST++
Browsable
AIST++
Dance Motion Dataset
Human Pose Estimation
Keypoint
10108015
Items
30
Classes
10108015
Labels
AMASS
Browsable
AMASS
Archive of Motion Capture as Surface Shapes
Human Pose Estimation
Keypoint Skeleton
ICCV
11000
Items
300
Classes
11000
Labels
ARDIS
Browsable
ARDIS
A Swedish historical handwritten digit dataset
Image Classification
Bounding Boxes
Blekinge Tekniska Högskola, SE-371 79, Karlskrona, Sweden
10000
Items
10
Classes
10000
Labels
ARID Dataset
Browsable
ARID Dataset
A Benchmark Dataset for Action Recognition in Dark Videos
Video Classification
Classification Tags
3780
Items
11
Classes
3780
Labels
ASTYX HIRES2019 DATASET
Browsable
ASTYX HIRES2019 DATASET
First automotive dataset containing imaging radar data
Autonomous Driving
Bounding Boxes
CRUISE MUNICH
Items
Classes
Labels
AU-AIR Dataset
Browsable
AU-AIR Dataset
Multi-modal UAV Dataset for Low Altitude Traffic Surveillance
Object Detection
Bounding Boxes
32832
Items
8
Classes
132034
Labels
AVAMVG
Browsable
AVAMVG
The AVA Multi-View Gait Dataset.
Human Pose Estimation
Semantic Segmentation
Items
Classes
Labels
AViD
Browsable
AViD
Anonymized Videos from Diverse Countries
Video Classification
Classification Tags
467000
Items
887
Classes
467000
Labels
Active Vision Dataset
Browsable
Active Vision Dataset
A Dataset for Developing and Benchmarking Active Vision
Object Detection
Bounding Boxes
70000
Items
15
Classes
30000
Labels
Agriculture-Vision
Browsable
Agriculture-Vision
A large aerial image dataset for agricultural pattern analysis
3D Semantic Segmentation
Semantic Segmentation
UIUC
3432
Items
Classes
94986
Labels
Aida
Browsable
Aida
Calculus Math Handwriting Recognition Dataset
Text Detection
Bounding Boxes
Aida
100000
Items
10
Classes
100000
Labels
Airport
Browsable
Airport
Person re-id dataset at airport
Person Re-Identification
Bounding Boxes
Senstar Corporation
39902
Items
9651
Classes
39902
Labels
Alzheimer's Research Data
Browsable
Alzheimer's Research Data
From Alzheimer’s Disease Neuroimaging Initiative
Medical Images
Semantic Segmentation
Alzheimer’s Disease Neuroimaging Initiative
Items
Classes
Labels
Amazon Berkeley Objects (ABO) Dataset
Browsable
Amazon Berkeley Objects (ABO) Dataset
A dataset of Amazon products with metadata, catalog images, and 3D models.
Object Detection
Bounding Boxes
Amazon
50000
Items
221
Classes
586584
Labels
Amazon Product Data
Browsable
Amazon Product Data
Product reviews and metadata from Amazon
Image Classification
Classification Tags
Items
Classes
Labels
Animal-Pose Dataset
Browsable
Animal-Pose Dataset
Dataset with animal pose annotations
Object Detection
Bounding Boxes
Shanghai Jiao Tong University
20
Items
5
Classes
4000
Labels
AnimalWeb
Browsable
AnimalWeb
A Large-Scale Hierarchical Dataset of Annotated Animal Faces
Face Keypoint Estimation
Keypoint
Computer Vision Laboratory, School of Computer Science
22400
Items
350
Classes
22400
Labels
Animals on the Web
Browsable
Animals on the Web
10 animal categories collected from the web
Image Classification
Semantic Segmentation
Items
Classes
Labels
Anime Face Dataset
Browsable
Anime Face Dataset
A collection of high-quality anime faces
Face Detection
Bounding Boxes
63632
Items
Classes
63632
Labels
ApolloCar3D
Browsable
ApolloCar3D
A Large 3D Car Instance Understanding Benchmark for Autonomous Driving
Autonomous Driving
Bounding Boxes
APOLLOSCAPE
Items
Classes
Labels
ApolloScape
Browsable
ApolloScape
Open data for autonomous driving from Baidu Apollo project
Autonomous Driving
Semantic Segmentation
Items
Classes
Labels
Arabic Handwritten Characters Dataset
Browsable
Arabic Handwritten Characters Dataset
MNIST-like dataset for Arabic handwritten characters
Image Classification
Classification Tags
13440
Items
480
Classes
13440
Labels
Arabic Handwritten Digits Dataset
Browsable
Arabic Handwritten Digits Dataset
MNIST-type dataset for Arabic digits
Image Classification
Classification Tags
Benha University
60000
Items
10
Classes
60000
Labels
Argoverse
Browsable
Argoverse
Self-driving car datasets with highly detailed maps
Autonomous Driving
Bounding Boxes
Argo AI, LLC
327793
Items
113
Classes
327793
Labels
ArtVQA
Browsable
ArtVQA
Dataset and Baselines for Visual Question Answering on Art
Visual Question Answering
Semantic Segmentation
VISART workshop at ECCV 2020
Items
Classes
Labels
Audio-Visual Event (AVE) Dataset
Browsable
Audio-Visual Event (AVE) Dataset
Dataset for audio-visual video understanding research
Event Detection
Bounding Boxes
University of Rochester
4143
Items
28
Classes
4143
Labels
B-MOD
Browsable
B-MOD
Brno Mobile OCR Dataset
Text Detection
Bounding Boxes
FIT BUT
19725
Items
10
Classes
19725
Labels
BDD100K
Browsable
BDD100K
The largest driving video dataset to date
Autonomous Driving
Semantic Segmentation
Mapillary
80000
Items
40
Classes
80000
Labels
BIMCV-COVID19+
Browsable
BIMCV-COVID19+
A large annotated dataset of RX and CT images of COVID19 patients
Medical Images
Semantic Segmentation
M.I.V
2300
Items
23
Classes
2300
Labels
BUFF
Browsable
BUFF
Bodies Under Flowing Fashion
3D Object Pose
Keypoint
Max-Planck-Gesellschaft
Items
Classes
Labels
Bark-101 Dataset
Browsable
Bark-101 Dataset
Image dataset of 101 bark classes
Image Classification
Classification Tags
2587
Items
101
Classes
2587
Labels
Birdsnap
Browsable
Birdsnap
Birdsnap
Image Classification
Classification Tags
Columbia University
47386
Items
500
Classes
48829
Labels
Bitewing Radiology Dataset
Browsable
Bitewing Radiology Dataset
Dental caries detection in bitewing radiography
Medical Images
Semantic Segmentation
80
Items
4
Classes
80
Labels
BlendedMVS
Browsable
BlendedMVS
A Large-scale Dataset for Generalized Multi-view Stereo Networks
3D Reconstruction / Photogrammetry
3D Point Cloud
110000
Items
113
Classes
110000
Labels
Bosch Small Traffic Lights Dataset
Browsable
Bosch Small Traffic Lights Dataset
Dataset for vision-based traffic light detection
Object Detection
Bounding Boxes
Bosch North America Research department, Palo Alto, California.
24000
Items
19
Classes
13427
Labels
BraTS
Browsable
BraTS
Multimodal Brain Tumor Segmentation Challenge 2017
Medical Images
Semantic Segmentation
University of Pennsylvania
Items
Classes
Labels
BuildingNet
Browsable
BuildingNet
A large-scale dataset of 3D building models
3D Semantic Segmentation
3D Point Cloud
UMass Amherst
513000
Items
2000
Classes
292000
Labels
CACD
Browsable
CACD
Cross-Age Celebrity Dataset
Image Classification
Classification Tags
163446
Items
5
Classes
163446
Labels
CADP
Browsable
CADP
A Novel Dataset for CCTV Traffic Camera based Accident Analysis
Object Detection
Bounding Boxes
Carnegie Mellon University, Language Technologies Institute
Items
Classes
Labels
CAMO
Browsable
CAMO
Camouflaged Object
Semantic Segmentation
Semantic Segmentation
Anabranch Network for Camouflaged Object Segmentation
1250
Items
2
Classes
1250
Labels
CARPK
Browsable
CARPK
Car parking lot dataset
Object Detection
Bounding Boxes
Spatially Regularized Regional Proposal Networks
Items
Classes
Labels
CASIA-B
Browsable
CASIA-B
CASIA-B
Gait Recognition
Bounding Boxes
Institute of Automation, Chinese Academy of Sciences
Items
Classes
Labels
CAT2000
Browsable
CAT2000
MIT saliency benchmark
Image Classification
Classification Tags
4000
Items
20
Classes
4000
Labels
CATER
Browsable
CATER
A diagnostic dataset for Compositional Actions and TEmporal Reasoning
Video Classification
Classification Tags
Items
Classes
Labels
CCIHP
Browsable
CCIHP
Characterized Crowd Instance-level Human Parsing
3D Semantic Segmentation
Semantic Segmentation
31
Items
20
Classes
28280
Labels
CCPD
Browsable
CCPD
A large and comprehensive license plate dataset
Object Detection
Bounding Boxes
250000
Items
8
Classes
250000
Labels
CED
Browsable
CED
First Color Event Camera Dataset
Image Classification
Classification Tags
University of Zurich
Items
Classes
Labels
CINIC-10
Browsable
CINIC-10
Drop-in replacement for CIFAR-10
Image Classification
Classification Tags
270000
Items
8
Classes
270000
Labels
CMU Panoptic
Browsable
CMU Panoptic
CMU Panoptic Dataset
Human Pose Estimation
Keypoint
1500000
Items
Classes
1500000
Labels
COCO Africa Masks
Browsable
COCO Africa Masks
A Curation Tool and Dataset of Common Objects in the Context of Africa
GAN / Image Generation
Image Pairs
COCO-Africa
Items
Classes
Labels
COCO Captions
Browsable
COCO Captions
COCO Captions
Image Captioning
Keypoint
University of Washington
1500000
Items
171
Classes
330000
Labels
COCO-Text
Browsable
COCO-Text
COCO-Text
Text Detection
Semantic Segmentation
239506
Items
6
Classes
63686
Labels
COIN
Browsable
COIN
A Large-scale Dataset for Comprehensive Instructional Video Analysis
Video Classification
Classification Tags
Tsinghua University
46354
Items
Classes
46354
Labels
CORe50
Browsable
CORe50
Dataset for Continual Learning and Object Recognition, Detection, Segmentation
Object Detection
Semantic Segmentation
University of Bologna
164886
Items
50
Classes
164886
Labels
COVID-19 Chest CT image Augmentation GAN Dataset
Browsable
COVID-19 Chest CT image Augmentation GAN Dataset
COVID-19 CoronaVirus Dataset Collection CT
Medical Images
Semantic Segmentation
MDPI
742
Items
2
Classes
742
Labels
COVID-19 Medical Face Mask Detection Dataset
Browsable
COVID-19 Medical Face Mask Detection Dataset
COVID-19 CoronaVirus Face Mask Collection
Image Classification
Classification Tags
ELSEVIER
1415
Items
3
Classes
1415
Labels
COVID-19 image data collection
Browsable
COVID-19 image data collection
Open database of COVID-19 cases with chest X-ray or CT images
Medical Images
Semantic Segmentation
900
Items
6
Classes
900
Labels
COVID-CT
Browsable
COVID-CT
CT images with clinical findings of COVID-19
Medical Images
Semantic Segmentation
275
Items
2
Classes
275
Labels
COVIDx
Browsable
COVIDx
Open Source Chest Radiography Datasets for COVID-19
Medical Images
Semantic Segmentation
University of Waterloo, Canada
16756
Items
2800
Classes
16756
Labels
CQ500
Browsable
CQ500
A dataset of head CT scans
Medical Images
Semantic Segmentation
qure.ai
Items
Classes
Labels
CRUW
Browsable
CRUW
CRUW is a public camera-radar dataset for autonomous vehicle applications
Autonomous Driving
Bounding Boxes
WACV
Items
Classes
Labels
CUFS
Browsable
CUFS
CUHK Face Sketch Database
GAN / Image Generation
Image Pairs
606
Items
7
Classes
606
Labels
CUHK-SYSU
Browsable
CUHK-SYSU
CUHK-SYSU Person Search Dataset
Face Detection
Bounding Boxes
18184
Items
8432
Classes
18184
Labels
CUHK01
Browsable
CUHK01
CUHK Person Re-identification
24312
Items
3
Classes
24312
Labels
CURE-TSD
Browsable
CURE-TSD
Challenging Unreal and Real Environments for Traffic Sign Detection
Object Detection
Bounding Boxes
OLIVES Lab, Georgia Institute of Technology
2989
Items
2
Classes
2989
Labels
Cam2BEV
Browsable
Cam2BEV
Synthetic, semantically segmented road-scene images
3D Semantic Segmentation
Semantic Segmentation
Items
Classes
Labels
Canadian Adverse Driving Conditions Dataset
Browsable
Canadian Adverse Driving Conditions Dataset
Open-source dataset for autonomous driving in wintry weather
Autonomous Driving
Bounding Boxes
University of Waterloo, Canada
10
Items
10
Classes
56000
Labels
CanadianBuildingFootprints
Browsable
CanadianBuildingFootprints
Dataset of Building Footprints in Canada
3D Semantic Segmentation
Semantic Segmentation
Microsoft Bing Map
12000000
Items
13
Classes
12000000
Labels
CapGaze
Browsable
CapGaze
Human attention in image captioning
Visual Attention
Bounding Boxes
4000
Items
Classes
4000
Labels
Cataracts Challenge Dataset
Browsable
Cataracts Challenge Dataset
Videos of Cataract Surgeries
Medical Images
Semantic Segmentation
Brest University Hospital
Items
Classes
Labels
Celeb-DF
Browsable
Celeb-DF
A Large-scale Challenging Dataset for DeepFake Forensics
DeepFakes
Bounding Boxes
deepfakeforensics
5639
Items
3
Classes
5639
Labels
CelebA-Spoof
Browsable
CelebA-Spoof
Large-scale face anti-spoofing dataset
Face Recognition
Bounding Boxes
Beijing Jiaotong University, Beijing, China
625537
Items
10177
Classes
625537
Labels
Chairs
Browsable
Chairs
Chairs
Object Detection
Bounding Boxes
1000
Items
2
Classes
1000
Labels
Chalearn CASIA-SURF Dataset
Browsable
Chalearn CASIA-SURF Dataset
Large-scale face Anti-spoofing dataset
Face Detection
Bounding Boxes
21000
Items
1000
Classes
21000
Labels

What are open datasets?

An open dataset for machine learning is a dataset that is freely available to anyone. You can use open datasets to train and test your machine learning models. Some of the public datasets in our library can be browsed directly on the V7 platform with a single click. Other datasets need to be downloaded. We’ve marked the browsable ones with a V7 icon.

What kinds of free datasets can I find here?

Our repository of open image datasets consists of free public datasets for computer vision projects. For your convenience, we’ve divided them into several categories, for example:

Computer vision task types

Use cases

You can also use advanced filtering options to browse our open image datasets by task, annotation type, use case, or licence. Additionally, you can look for interesting datasets by typing a keyword, or sort the results alphabetically, by popularity, or by the number of images in the dataset.
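
If you keep your own local notes on the datasets you find, the same idea of combining filters and then sorting can be sketched in a few lines of Python. This is only an illustration, not how the V7 site works internally: the `DatasetCard` class, the `filter_cards` and `sort_cards` helpers, and their parameter names are all hypothetical. The four sample records reuse the task, annotation type, and item count shown on the cards above.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class DatasetCard:
    """Hypothetical local record mirroring the fields shown on a result card."""
    name: str
    task: str
    annotation_type: str
    items: int


# Sample values copied from the cards listed on this page.
CARDS = [
    DatasetCard("CIFAR-10", "Image Classification", "Classification Tags", 60000),
    DatasetCard("Stanford Dogs", "Image Classification", "Bounding Boxes", 22126),
    DatasetCard("AU-AIR Dataset", "Object Detection", "Bounding Boxes", 32832),
    DatasetCard("Bark-101 Dataset", "Image Classification", "Classification Tags", 2587),
]


def filter_cards(
    cards: Iterable[DatasetCard],
    task: Optional[str] = None,
    annotation_type: Optional[str] = None,
    keyword: Optional[str] = None,
) -> List[DatasetCard]:
    """Keep cards that match every filter that was supplied (filters combine with AND)."""
    result = []
    for card in cards:
        if task is not None and card.task != task:
            continue
        if annotation_type is not None and card.annotation_type != annotation_type:
            continue
        if keyword is not None and keyword.lower() not in card.name.lower():
            continue
        result.append(card)
    return result


def sort_cards(cards: Iterable[DatasetCard], by: str = "name") -> List[DatasetCard]:
    """Sort alphabetically by name, or by item count with the largest first."""
    if by == "name":
        return sorted(cards, key=lambda c: c.name.lower())
    if by == "items":
        return sorted(cards, key=lambda c: c.items, reverse=True)
    raise ValueError(f"unsupported sort key: {by!r}")


if __name__ == "__main__":
    # Combine two filters, then sort the matches by size.
    matches = filter_cards(
        CARDS, task="Image Classification", annotation_type="Classification Tags"
    )
    for card in sort_cards(matches, by="items"):
        print(card.name, card.items)
```

Each filter is optional, so leaving one out widens the search in the same way that clearing a filter on the page does.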

What does an example dataset look like?

While most datasets on our list must be downloaded and imported into an image annotation tool of your choice to view the images, you can easily look inside the open datasets hosted on V7.

All you need to do is look for a result card with a blue Browsable tag in the top right corner and a V7 icon in the description.

Click on the card, and go to the open dataset’s page. There, in the right-hand panel, click on the View this Dataset button.

After clicking the button, you’ll see all the images from the dataset.

You can click on any image in the open dataset to see the annotations.

Are all datasets on the list free?

Yes, all the online datasets in our repository are free to use. The only limitations may involve the scope of usage or requirements to attribute the dataset to its source. You can easily see what licence a given open dataset falls under on each dataset’s dedicated page.

What are the benefits of using open datasets?

Open datasets offer a number of benefits for computer vision projects. Firstly, they allow for easier collaboration between researchers. When data is openly available, researchers can more easily share and build upon each other’s work. This helps to accelerate the pace of research and allows for more innovative solutions to be found.

Secondly, open datasets help to ensure that the data used is of high quality. When data is openly available, it is subject to greater scrutiny from the research community. This helps to ensure that any flaws or errors in the data are quickly identified and corrected.

Finally, open datasets allow for replicability of results. When data is openly available, researchers can more easily check and verify each other’s results. This helps to build confidence in the findings of a study and allows for more reliable conclusions to be drawn.

Can I use all public datasets on the V7 platform?

Yes, all sample datasets in our repository can be imported into V7.

Where can I find machine learning datasets?

One of the best places to look for quality open source datasets is our own repository. You can use advanced filtering options and the search box to look for very specific datasets.

For example, if you’re only interested in a specific licence, such as public domain datasets, make sure to select the CC-0 option in the licence filter.

You can combine various filtering options to narrow down your search.

If you haven’t found what you’re looking for among the 500+ open datasets we’ve catalogued here, don’t despair: there are other places you may want to visit.

Start with these articles:

Each of them comes with detailed descriptions and links to online datasets for various purposes.
