Distorted Document Images dataset (DDI-100)

Dataset for Text Detection and Recognition

To facilitate new research in document recognition, we introduce the Distorted Document Images dataset (DDI-100). To create the dataset, we collected 6658 unique document pages and extended them by applying different types of distortions and geometric transformations. In total, DDI-100 contains 99870 document images together with text masks, stamp masks, and text and character locations given as bounding boxes.

Items: 30000
Labels: 99870
Last updated: January 5, 2022
License: Unknown