Open Datasets

Browse 500+ open source datasets for your next machine learning project. Our list of free datasets keeps growing, so make sure you visit it frequently.

COVID-19 Chest X-ray Dataset
Lung segmentation of cases of COVID-19 collected in 2020
Polygons
V7
13008 items, 702 classes, 6504 labels
CIFAR-10
The CIFAR-10 dataset with 1 tag per image
Image Classification
Classification Tags
V7
60000 items, 10 classes, 60000 labels
Pens Segmentation
Manually and semi-automatically segmented pens for instance segmentation experiments
Polygons
V7
420 items, 1 class, 420 labels
Items Picking Dataset
4600 instance segmentation labels of scattered objects for robotic picking
Polygons
V7
4636 items, 11 classes, 568 labels
Stanford Dogs
The Stanford Dogs dataset with 120 breeds and 1 tag per image
Image Classification
Bounding Boxes
V7
22126 items, 120 classes, 20580 labels
Panoramic Dental Tooth Segmentation
Segmentations of teeth in OPG scans
Polygons
V7
3231 items, 1 class, 116 labels
Breast Duct Cell Segmentation
23 thousand segmented breast duct cells in brightfield microscopy
Instance Segmentation
V7
24825 items, 5 classes, 409 labels
Bird Species
Segmentation of three bird species across 1000 images
Polygons
V7
3950 items, 3 classes, 1909 labels
White Blood Cell
The White Blood Cell dataset with bounding boxes
Polygons
V7
781 items, 5 classes, 410 labels
LAX Aircraft Detection
1600 instance segmentations of airplanes
Polygons
V7
1604 items, 1 class, 72 labels
PanNuke
Open Pan-Cancer Histology Dataset
Medical Images
Semantic Segmentation
20000 items, 312 classes, 20000 labels
FMD
Fluorescence Microscopy Denoising dataset
Image Classification
Classification Tags
60000 items, 10 classes, 60000 labels
m2caiseg
Semantic Segmentation of Laparoscopic Surgical Images
3D Semantic Segmentation
Semantic Segmentation
307 items, 19 classes, 307 labels
KaoKore Dataset
Collection of Facial Expressions from Japanese Artwork
Image Classification
Classification Tags
8842 items, 2 classes, 8848 labels
DeepFake Detection
A large dataset of visual deepfakes from Google
GAN / Image Generation
Image Pairs
Items
Classes
Labels
MASSVIS
Massachusetts (Massive) Visualization Dataset
Data Visualization
Items
Classes
Labels
Holopix50k
A Large-Scale In-the-wild Stereo Image Dataset
3D Reconstruction / Photogrammetry
Semantic Segmentation
49368 items, 12 classes, 49368 labels
COVID-19 image data collection
Open database of COVID-19 cases with chest X-ray or CT images
Medical Images
Semantic Segmentation
900 items, 6 classes, 900 labels
MSeg
A Composite Dataset for Multi-domain Semantic Segmentation
3D Semantic Segmentation
Semantic Segmentation
220000 items, 316 classes, 80000 labels
NightOwls
Pedestrian detection at night
Object Detection
Bounding Boxes
CVPR
297000 items, 40 classes, 297000 labels
PlotQA
A question-answering dataset for reasoning over scientific plots.
Object Detection
Bounding Boxes
IIT Madras
Items
Classes
Labels

Frequently Asked Questions

Looking for more materials to build trustworthy AI? Discover our resources page, packed with free guides, webinars, and V7 product updates.
Where to find machine learning datasets?

One of the best places to look for quality open source datasets is our own repository. You can use advanced filtering options and the search box to look for very specific datasets.

For example, if you’re only interested in datasets under a specific licence, such as public domain, select the CC-0 option in the licence filter.

You can combine various filtering options to narrow down your search.

If you haven’t found what you’re looking for among the 500+ open datasets we’ve catalogued here, don’t despair—there are other places you may want to visit.

Start with these articles:

Each of them comes with detailed descriptions and links to online datasets for various purposes.

Can I use all public datasets on the V7 platform?

Yes, all sample datasets in our repository can be imported into V7.

What are the benefits of using open datasets?

Open datasets offer a number of benefits for computer vision projects. Firstly, they allow for easier collaboration between researchers. When data is openly available, researchers can more easily share and build upon each other’s work. This helps to accelerate the pace of research and allows for more innovative solutions to be found.

Secondly, open datasets help to ensure that the data used is of high quality. When data is openly available, it is subject to greater scrutiny from the research community. This helps to ensure that any flaws or errors in the data are quickly identified and corrected.

Finally, open datasets allow for replicability of results. When data is openly available, researchers can more easily check and verify each other’s results. This helps to build confidence in the findings of a study and allows for more reliable conclusions to be drawn.

Are all datasets on the list free?

Yes, all the online datasets in our repository are free to use. The only limitations may involve the scope of usage or requirements to attribute the dataset to its source. You can easily see what licence a given open dataset falls under on each dataset’s dedicated page.

What kinds of free datasets can I find here?

Our repository of open image datasets consists of free public datasets for computer vision projects. For your convenience we’ve divided them into several categories, e.g.:

Computer vision task types

Use cases

You can use advanced filtering options to browse our open image datasets by task, annotation type, use case, or licence. You can also search by keyword or sort the results alphabetically, by popularity, or by the number of images in the dataset.
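
The same kind of filtering and sorting can be tried offline. The following is a minimal sketch, assuming a small hand-built list of records copied from the dataset cards on this page. The `Dataset` class and the `filter_datasets` function are hypothetical names used for illustration and are not part of any V7 API.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Dataset:
    name: str
    annotation_type: str
    items: int
    classes: int


# A few entries copied by hand from the listing on this page.
CATALOG = [
    Dataset("COVID-19 Chest X-ray Dataset", "Polygons", 13008, 702),
    Dataset("CIFAR-10", "Classification Tags", 60000, 10),
    Dataset("Pens Segmentation", "Polygons", 420, 1),
    Dataset("NightOwls", "Bounding Boxes", 297000, 40),
    Dataset("Bird Species", "Polygons", 3950, 3),
]


def filter_datasets(
    catalog: List[Dataset],
    annotation_type: Optional[str] = None,
    keyword: Optional[str] = None,
    min_items: int = 0,
) -> List[Dataset]:
    """Return the entries matching every given filter, largest first."""
    result = []
    for ds in catalog:
        if annotation_type is not None and ds.annotation_type != annotation_type:
            continue
        if keyword is not None and keyword.lower() not in ds.name.lower():
            continue
        if ds.items < min_items:
            continue
        result.append(ds)
    # Sort by the number of items, descending.
    return sorted(result, key=lambda ds: ds.items, reverse=True)
```

Calling `filter_datasets(CATALOG, annotation_type="Polygons")` combines a filter with a sort, returning the three polygon datasets ordered from most to fewest items.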

What are open datasets?

An open dataset for machine learning is a dataset that is freely available to anyone. You can use open datasets in your projects to train and test your machine learning models.
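
To illustrate the train-and-test use, here is a minimal sketch of a reproducible split over a list of image file names. The file names, the 80/20 ratio, and the `split_dataset` helper are assumptions made for illustration. Many projects use a library helper for this step instead.

```python
import random


def split_dataset(items, test_fraction=0.2, seed=0):
    """Shuffle a copy of items and split it into (train, test) lists."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = list(items)
    # A seeded generator keeps the split identical between runs.
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical file names standing in for a downloaded open dataset.
images = [f"image_{i:04d}.png" for i in range(420)]
train, test = split_dataset(images)
print(len(train), len(test))
```

Fixing the seed means the same images land in the test set on every run, so results stay comparable between experiments.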
