Download

The entire dataset (1232 document images) is available upon request.

 

Citation

 

Please use the following reference if you use BDID with BASIC ground truth in your research:

 

@article{Cote2014Texture,
year={2014},
issn={1433-2833},
journal={International Journal on Document Analysis and Recognition (IJDAR)},
volume={17},
number={3},
doi={10.1007/s10032-014-0217-8},
title={Texture sparseness for pixel classification of business document images},
url={http://dx.doi.org/10.1007/s10032-014-0217-8},
publisher={Springer Berlin Heidelberg},
author={Cote, Melissa and Branzan Albu, Alexandra},
pages={257--273},
}

 

Please use the following reference if you use BDID with LAYERED ground truth in your research:

 

@inproceedings{Cote2016Layered,
author={Cote, Melissa and Branzan Albu, Alexandra},
booktitle={2016 23rd International Conference on Pattern Recognition (ICPR)},
title={Layered ground truth: Conveying structural and statistical information for document image analysis and evaluation},
year={2016},
pages={3258--3263},
doi={10.1109/ICPR.2016.7900137},
organization={IEEE},
month={Dec},
}

Download (with basic ground truth)

Two sample subsets of the Business Document Image Dataset, each containing 60 ground-truthed one-page business document images, are available for download. The first (146 MB) includes basic ground truth; the second (210 MB) includes layered ground truth. Both are to be used for research purposes only.

Download (with layered ground truth)

Copyright © 2013-2017  |  Last modified on 2017/05/15

BDID

Business Document Image Dataset