Distorted Document Images dataset (DDI-100)

Dataset for Text Detection and Recognition

To facilitate new research in document recognition, we introduce the Distorted Document Images dataset (DDI-100). To create it, we collected 6658 unique document pages and extended this set by applying different types of distortions and geometric transformations. In total, DDI-100 contains 99870 document images together with text masks, stamp masks, and text and character locations given as bounding boxes.

Task: Text Detection
Annotation Types: Bounding Boxes
Items: 30000
Classes: 4
Labels: 99870
Last updated: January 20, 2022
License: Unknown