A curved-text dataset that outlived its own code
SCUT-CTW1500 was built for detecting bent text in the wild, but the project itself has gone stale.

What it does
SCUT-CTW1500 is a dataset and evaluation toolkit for detecting curved text in natural scenes — think street signs, product labels, or warped banners. It ships with 1,500 images (1,000 train, 500 test), detection and end-to-end recognition annotations, plus an evaluation script that also covers the Total-Text benchmark.
The interesting bit
The updated annotations add per-character point labels, which are granular enough to reconstruct text shapes with polygon precision. The authors also explicitly mask Chinese text as ‘###’ (ignore) because the sample is too small to be useful, a refreshingly honest choice compared with padding a benchmark with junk data.
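To make the ignore convention concrete, here is a minimal sketch of how a loader might separate ‘###’ instances from the ones that count. The record layout (a dict with `text` and `points` keys), the `split_ignored` helper, and the sample coordinates are all assumptions made for illustration; they are not the dataset's actual file format or the authors' evaluation code.

```python
from typing import Dict, List, Tuple

# Transcription that marks an instance as "ignore" (used for Chinese text).
IGNORE_TAG = "###"

# Hypothetical in-memory record: a transcription plus polygon vertices.
# This layout is an assumption, not the dataset's on-disk format.
Annotation = Dict[str, object]


def split_ignored(
    annotations: List[Annotation],
) -> Tuple[List[Annotation], List[Annotation]]:
    """Split annotations into (kept, ignored) based on the ignore tag."""
    kept: List[Annotation] = []
    ignored: List[Annotation] = []
    for ann in annotations:
        if ann["text"] == IGNORE_TAG:
            ignored.append(ann)
        else:
            kept.append(ann)
    return kept, ignored


# Made-up sample: one curved English instance and one masked instance.
sample: List[Annotation] = [
    {
        "text": "COFFEE",
        "points": [(10, 40), (60, 20), (110, 40), (110, 70), (60, 50), (10, 70)],
    },
    {
        "text": IGNORE_TAG,
        "points": [(200, 30), (260, 30), (260, 60), (200, 60)],
    },
]

kept, ignored = split_ignored(sample)
print(f"{len(kept)} scored instance(s), {len(ignored)} ignored")
```

Benchmarks that use an ignore tag typically keep the ignored regions around rather than deleting them, so that a detection overlapping one is not counted as a false positive. Whether this toolkit does exactly that is worth confirming in its evaluation script.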
Key highlights
- 1,500 images with curved and arbitrarily-shaped text instances
- Per-character point annotations for fine-grained shape modeling
- Evaluation script supports both detection-only and end-to-end recognition metrics
- The same evaluation script also covers the Total-Text benchmark
- Free for academic use; commercial use requires contacting the authors
Caveats
- The project is explicitly marked as outdated and unmaintained — the authors direct you to OLD_README.md for original details
- Chinese text is deliberately marked as ignore in both the training and test annotations, so don’t expect multilingual coverage
- Download links are scattered across Box and AARNet cloud storage, not versioned in the repo
Verdict
Worth a look if you’re benchmarking historical curved-text methods or need the specific polygon annotations for a legacy comparison. Skip it if you want maintained tooling — the field has moved on to newer datasets like ArT and LSVT, which the authors themselves point toward.