waspinator/pycococreator

The COCO dataset format is a pain; this makes it slightly less so

Helper functions that translate your masks and polygons into the exact JSON structure COCO demands, because doing it by hand is masochism.

784 stars Python Data Tooling
Velocity (7d): +0.3 ★/day · Trend: steady

What it does

pycococreator is a small Python utility for generating COCO-format annotations. It handles the two main encoding styles: uncompressed RLE (used for “crowd” annotations) and polygons. You feed it masks or contours; it spits out the JSON structure that COCO tools expect.
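
To make "uncompressed RLE" concrete, here is a minimal standard-library sketch of the encoding as COCO describes it: run lengths over the mask read column by column, always starting with the count of zeros. The function name `mask_to_uncompressed_rle` is hypothetical and this is not pycococreator's own code, just an illustration of the output shape you should expect.

```python
from itertools import groupby


def mask_to_uncompressed_rle(mask):
    """Encode a binary mask (list of rows of 0/1) as COCO-style uncompressed RLE.

    Hypothetical helper for illustration; not part of pycococreator's API.
    """
    height = len(mask)
    width = len(mask[0]) if height else 0
    # COCO RLE walks the mask in column-major order (down each column first).
    flat = [mask[row][col] for col in range(width) for row in range(height)]

    counts = []
    for index, (value, run) in enumerate(groupby(flat)):
        # Counts always begin with the number of zeros, so a mask whose
        # first pixel is 1 needs a leading run of length 0.
        if index == 0 and value == 1:
            counts.append(0)
        counts.append(len(list(run)))
    return {"counts": counts, "size": [height, width]}


example_mask = [
    [0, 1, 1],
    [0, 1, 0],
]
rle = mask_to_uncompressed_rle(example_mask)
print(rle)  # {'counts': [2, 3, 1], 'size': [2, 3]}
```

Reading the example column by column gives 0, 0, 1, 1, 1, 0, hence two zeros, three ones, one zero. Note that `size` is `[height, width]`, which is easy to get backwards.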

The interesting bit

The real value is in the tedium it absorbs. COCO’s annotation schema is picky: the segmentation encoding, category IDs, and image metadata all have to agree, and every annotation has to point at an image and a category that actually exist in the file. This wraps the conversion logic so you don’t have to read the COCO spec three times to figure out why your dataset loads as empty.
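
One way a dataset can end up looking empty is an annotation that references an image or category ID that is not in the file. The sketch below is a plain-Python sanity check over an already-built COCO dict; `find_dangling_references` and the sample data are made up for illustration and are not something pycococreator ships.

```python
def find_dangling_references(dataset):
    """Return (annotation_id, problem, bad_value) tuples for broken ID links.

    Hypothetical helper; assumes the usual COCO top-level keys
    "images", "categories", and "annotations".
    """
    image_ids = {image["id"] for image in dataset.get("images", [])}
    category_ids = {cat["id"] for cat in dataset.get("categories", [])}

    problems = []
    for ann in dataset.get("annotations", []):
        if ann["image_id"] not in image_ids:
            problems.append((ann["id"], "unknown image_id", ann["image_id"]))
        if ann["category_id"] not in category_ids:
            problems.append((ann["id"], "unknown category_id", ann["category_id"]))
    return problems


# Illustrative data: annotation 10 is consistent, annotation 11 is not.
dataset = {
    "images": [{"id": 1, "file_name": "0001.jpg", "width": 640, "height": 480}],
    "categories": [{"id": 1, "name": "square", "supercategory": "shape"}],
    "annotations": [
        {"id": 10, "image_id": 1, "category_id": 1, "iscrowd": 0,
         "bbox": [10, 20, 30, 40], "area": 1200,
         "segmentation": [[10, 20, 40, 20, 40, 60, 10, 60]]},
        {"id": 11, "image_id": 2, "category_id": 7, "iscrowd": 0,
         "bbox": [0, 0, 5, 5], "area": 25,
         "segmentation": [[0, 0, 5, 0, 5, 5, 0, 5]]},
    ],
}

problems = find_dangling_references(dataset)
for problem in problems:
    print(problem)
```

Running a check like this before handing the JSON to a loader is cheap, and it turns a silent "no annotations found" into a list of specific IDs to fix.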

Key highlights

  • Generates both uncompressed RLE and polygon segmentations in COCO-compliant JSON
  • Companion blog post walks through creating a dataset from scratch
  • Published with a Zenodo DOI for citation
  • 784 stars, suggesting it solved a real bottleneck for a fair number of CV practitioners
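
For the polygon side of the first highlight, this is roughly the shape of a COCO polygon annotation: a flat `[x1, y1, x2, y2, ...]` list wrapped in an outer list, plus a `[x, y, width, height]` bounding box. The helper `polygon_annotation` is hypothetical, and the shoelace area is an assumption made for the sketch; tools that derive area from the rasterized mask may report a slightly different number.

```python
def polygon_annotation(annotation_id, image_id, category_id, points):
    """Build a COCO-style polygon annotation from a list of (x, y) vertices.

    Hypothetical helper for illustration; not part of pycococreator's API.
    """
    xs = [x for x, _ in points]
    ys = [y for _, y in points]

    # Shoelace formula: gives twice the signed area of a simple polygon.
    n = len(points)
    twice_area = sum(
        points[i][0] * points[(i + 1) % n][1]
        - points[(i + 1) % n][0] * points[i][1]
        for i in range(n)
    )

    return {
        "id": annotation_id,
        "image_id": image_id,
        "category_id": category_id,
        "iscrowd": 0,  # polygons are used for non-crowd objects
        # One object may have several polygons, hence the outer list.
        "segmentation": [[coord for point in points for coord in point]],
        # COCO boxes are [x_min, y_min, width, height], not corner pairs.
        "bbox": [min(xs), min(ys), max(xs) - min(xs), max(ys) - min(ys)],
        "area": abs(twice_area) / 2,
    }


triangle = [(0, 0), (4, 0), (0, 3)]
annotation = polygon_annotation(1, 1, 1, triangle)
print(annotation["segmentation"])  # [[0, 0, 4, 0, 0, 3]]
print(annotation["bbox"])          # [0, 0, 4, 3]
print(annotation["area"])          # 6.0
```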

Caveats

  • Install instructions reference git:// protocol URLs; GitHub switched off the unencrypted git:// protocol in 2022, so you’ll likely need to swap to https:// or SSH
  • README is sparse—no API docs, no example code inline, just two screenshots and a link to an external blog post
  • Last tagged release (0.2.0) appears stale; check commit history if you need recent fixes

Verdict

Grab this if you’re building a custom COCO dataset and don’t want to write RLE encoding from scratch. Skip it if you need comprehensive documentation or active maintenance; in that case, pycocotools itself plus a careful reading of the format spec may serve you better.
