pedropro/TACO

A dataset for teaching AI to spot litter in the wild

TACO provides manually segmented images of trash on roads, beaches, and in woods to train object detection models that can find garbage where humans left it.

★ 746 stars · Jupyter Notebook · Computer Vision · Data Tooling

What it does

TACO is a dataset and toolkit for training object detection models to identify litter in outdoor environments. It pairs images hosted on Flickr with pixel-level segmentation annotations in COCO format, and ships scripts to download the images, split them into train/val/test sets, and run a modified Mask R-CNN implementation. The project also hosts a web tool at tacodataset.org for collecting more crowd-sourced annotations.
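To make the COCO layout concrete, here is a minimal sketch of reading a COCO-style annotation file and grouping its annotations by image. The inline JSON, the file name example_1.jpg, and the helper index_annotations are illustrative assumptions, not contents or code of the TACO repository; only the top-level keys and field names follow the standard COCO format.

```python
import json
from collections import defaultdict


def index_annotations(coco):
    """Group COCO-style annotations by the id of the image they belong to."""
    by_image = defaultdict(list)
    for ann in coco["annotations"]:
        by_image[ann["image_id"]].append(ann)
    return dict(by_image)


# Tiny inline stand-in for an annotation file (hypothetical values).
raw = """
{
  "images": [
    {"id": 1, "file_name": "example_1.jpg", "width": 640, "height": 480},
    {"id": 2, "file_name": "example_2.jpg", "width": 640, "height": 480}
  ],
  "categories": [
    {"id": 1, "name": "Bottle", "supercategory": "Bottle"},
    {"id": 2, "name": "Can", "supercategory": "Can"}
  ],
  "annotations": [
    {"id": 1, "image_id": 1, "category_id": 1, "iscrowd": 0,
     "bbox": [10, 20, 30, 40], "area": 1200,
     "segmentation": [[10, 20, 40, 20, 40, 60, 10, 60]]},
    {"id": 2, "image_id": 1, "category_id": 2, "iscrowd": 0,
     "bbox": [100, 100, 20, 20], "area": 400,
     "segmentation": [[100, 100, 120, 100, 120, 120, 100, 120]]}
  ]
}
"""

coco = json.loads(raw)
by_image = index_annotations(coco)

# COCO boxes are [x, y, width, height]; polygons are flat [x1, y1, x2, y2, ...].
for image in coco["images"]:
    anns = by_image.get(image["id"], [])
    print(image["file_name"], "has", len(anns), "annotation(s)")
```

With a real annotation file, json.load on the opened file would replace the inline string; images without annotations simply have no entry in the index.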

The interesting bit

The taxonomy is hierarchical and the class distribution is heavily imbalanced: most categories have very few examples. To cope, the authors provide pre-built class maps that collapse rare trash types into dominant ones like cans, bottles, and plastic bags, and you can also define your own groupings. The “unofficial” annotations submitted via the website are kept in a separate file and are not vetted, which is a refreshingly honest way to handle crowd data.
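As an illustration of what collapsing categories with a class map might look like, the sketch below remaps a COCO-style dict onto a smaller set of target classes. The function apply_class_map, the fallback name "Other litter", and the sample category names are assumptions made for this example; the repository's own class maps and their file format may differ.

```python
from copy import deepcopy


def apply_class_map(coco, class_map, fallback="Other litter"):
    """Return a copy of a COCO-style dict with categories merged per class_map.

    class_map maps original category names to target names; categories
    missing from the map are folded into the fallback class.
    """
    # Original category id -> target class name.
    id_to_target = {
        cat["id"]: class_map.get(cat["name"], fallback)
        for cat in coco["categories"]
    }
    # Give each target name a new contiguous id, in first-seen order.
    target_ids = {}
    for name in id_to_target.values():
        target_ids.setdefault(name, len(target_ids) + 1)

    merged = deepcopy(coco)  # leave the input untouched
    merged["categories"] = [
        {"id": new_id, "name": name} for name, new_id in target_ids.items()
    ]
    for ann in merged["annotations"]:
        ann["category_id"] = target_ids[id_to_target[ann["category_id"]]]
    return merged


# Hypothetical sample data and grouping.
sample = {
    "images": [{"id": 1, "file_name": "example_1.jpg"}],
    "categories": [
        {"id": 1, "name": "Drink can"},
        {"id": 2, "name": "Food can"},
        {"id": 3, "name": "Clear plastic bottle"},
        {"id": 4, "name": "Shoe"},
    ],
    "annotations": [
        {"id": 10, "image_id": 1, "category_id": 1},
        {"id": 11, "image_id": 1, "category_id": 2},
        {"id": 12, "image_id": 1, "category_id": 3},
        {"id": 13, "image_id": 1, "category_id": 4},
    ],
}
class_map = {
    "Drink can": "Can",
    "Food can": "Can",
    "Clear plastic bottle": "Bottle",
}

merged = apply_class_map(sample, class_map)
print([c["name"] for c in merged["categories"]])
```

Folding unmapped categories into a catch-all class is one design choice; dropping their annotations entirely is an equally plausible alternative when the rare classes are not of interest.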

Key highlights

Annotations follow standard COCO format, so it plugs into existing detection pipelines with minimal friction
Includes a working Mask R-CNN detector fork in /detector with dataset splitting and config scripts
Images are hosted on Flickr, not in the repo itself; download.py fetches them on demand
Provides both official reviewed annotations and a separate annotations_unofficial.json for crowd submissions
Paper and citation info available at arXiv:2003.06975
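The dataset splitting mentioned above can be pictured with a small sketch. This is not the repository's split script: split_image_ids, its fraction parameters, and the seed are hypothetical, and the code only shows the general idea of a reproducible split done at the image level so that all annotations of one image land in the same subset.

```python
import random


def split_image_ids(image_ids, val_frac=0.1, test_frac=0.1, seed=0):
    """Shuffle image ids reproducibly and cut them into train/val/test lists."""
    ids = sorted(image_ids)            # make the result independent of input order
    random.Random(seed).shuffle(ids)   # seeded generator gives a repeatable shuffle
    n_test = int(len(ids) * test_frac)
    n_val = int(len(ids) * val_frac)
    test_ids = ids[:n_test]
    val_ids = ids[n_test:n_test + n_val]
    train_ids = ids[n_test + n_val:]
    return train_ids, val_ids, test_ids


# Twenty made-up image ids stand in for the ids found in an annotation file.
train_ids, val_ids, test_ids = split_image_ids(range(1, 21))
print(len(train_ids), len(val_ids), len(test_ids))
```

Filtering the "images" and "annotations" lists by each id list would then yield one COCO-style dict per subset.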

Caveats

The dataset is “still relatively small” per the authors’ own admission
Most original classes have very few annotations, forcing you to merge or drop categories
Requires separate installation of the COCO Python API to run the demo notebook
Unofficial annotations are explicitly flagged as potentially inaccurate or poorly segmented
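Before deciding which categories to merge or drop, it helps to count annotations per category. The sketch below does this for any COCO-style dict; the helper names, the sample data, and the min_count threshold are illustrative assumptions rather than anything prescribed by the TACO authors.

```python
from collections import Counter


def annotations_per_category(coco):
    """Count annotations per category name, including categories with none."""
    names = {cat["id"]: cat["name"] for cat in coco["categories"]}
    counts = Counter({name: 0 for name in names.values()})
    for ann in coco["annotations"]:
        counts[names[ann["category_id"]]] += 1
    return counts


def rare_categories(coco, min_count):
    """Return, sorted by name, the categories with fewer than min_count annotations."""
    counts = annotations_per_category(coco)
    return sorted(name for name, n in counts.items() if n < min_count)


# Hypothetical sample: three categories with 3, 2, and 0 annotations.
sample = {
    "categories": [
        {"id": 1, "name": "Bottle"},
        {"id": 2, "name": "Can"},
        {"id": 3, "name": "Shoe"},
    ],
    "annotations": [
        {"id": i, "image_id": 1, "category_id": c}
        for i, c in enumerate([1, 1, 1, 2, 2], start=1)
    ],
}

counts = annotations_per_category(sample)
rare = rare_categories(sample, min_count=3)
print(counts.most_common())
print("candidates to merge or drop:", rare)
```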

Verdict

Worth a look if you’re building litter-detection models for drones, beach-cleaning robots, or environmental monitoring. Skip it if you need a large, balanced dataset out of the box—this is a growing community effort, not a finished product.

Frequently asked

What is pedropro/TACO?: TACO provides manually segmented images of trash on roads, beaches, and in woods to train object detection models that can find garbage where humans left it.
Is TACO open source?: Yes — pedropro/TACO is open source, released under the MIT license.
What language is TACO written in?: GitHub reports Jupyter Notebook as the primary language of pedropro/TACO; the notebooks and helper scripts such as download.py are Python.
How popular is TACO?: pedropro/TACO has 746 stars on GitHub.
Where can I find TACO?: pedropro/TACO is on GitHub at https://github.com/pedropro/TACO.