NOTE: The dataset is publicly available for non-commercial use. Please cite Risnumawan et al., ESWA, 2014 if you use this dataset in your publication.


We introduce CUTE80, the first curved text dataset to be made public. It consists of 80 curved text line images (circular, S-shaped and Z-shaped text lines) with complex backgrounds, perspective distortion and poor resolution. CUTE80 is needed to show how well current text detection methods handle curved text. The images are indoor or outdoor scenes, either captured with a digital camera or retrieved from the Internet.

The dataset comprises two folders and a ReadMe file:

  1. CUTE 80 Images
    • Folder name = CUTE80
    • Total Images = 80
  2. Groundtruth
    • Folder name = Groundtruth
    • Total Files (XML) = 1
    • The ground truth is manually annotated and contains, for each curved text line, the set of polygon points of its bounding box.
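
As a rough illustration of how polygon annotations of this kind might be consumed, the Python sketch below parses a small XML sample into per-image lists of (x, y) points and computes the area each polygon encloses. The element and attribute names (groundtruth, image, file, polygon, point, x, y) and the helper functions load_polygons and polygon_area are all hypothetical; they are not taken from the CUTE80 ground truth file, so check the actual XML and adjust the names before use.

```python
import xml.etree.ElementTree as ET

# HYPOTHETICAL schema: every element and attribute name below is an
# assumption for illustration, not the real CUTE80 ground-truth layout.
SAMPLE_XML = """
<groundtruth>
  <image file="example.jpg">
    <polygon>
      <point x="10" y="20"/>
      <point x="110" y="15"/>
      <point x="115" y="60"/>
      <point x="12" y="65"/>
    </polygon>
  </image>
</groundtruth>
"""


def load_polygons(xml_text):
    """Return {image file name: [polygon, ...]}; a polygon is a list of (x, y)."""
    root = ET.fromstring(xml_text)
    polygons = {}
    for image in root.iter("image"):
        image_polygons = []
        for polygon in image.iter("polygon"):
            points = [
                (float(p.get("x")), float(p.get("y")))
                for p in polygon.iter("point")
            ]
            image_polygons.append(points)
        polygons[image.get("file")] = image_polygons
    return polygons


def polygon_area(points):
    """Area enclosed by a simple polygon, computed with the shoelace formula."""
    total = 0.0
    for i in range(len(points)):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % len(points)]  # wrap around to close the polygon
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


polygons = load_polygons(SAMPLE_XML)
for name, image_polygons in polygons.items():
    for points in image_polygons:
        print(name, len(points), "points, area =", polygon_area(points))
```

Keeping each text line as an ordered list of points, rather than reducing it to an axis-aligned rectangle, preserves the curved outline that the annotation is meant to capture.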


A Robust Arbitrary Text Detection System for Natural Scene Images
A. Risnumawan, P. Shivakumara, C.S. Chan and C.L. Tan
Expert Systems with Applications, vol. 41(18), pp. 8027-8048, 2014 (ESWA 2014)


This work was done jointly by the University of Malaya (UM), Malaysia, supported by the University of Malaya HIR under Grant Nos. UM.C/625/1/HIR/037 and J0000073579, and the National University of Singapore (NUS), Singapore. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the University of Malaya.

DISCLAIMER: Electronic papers shared on this website are for fast dissemination of research work. The copyrights of the papers belong to the corresponding publishers.