Your camera as a pocket interpreter
A web app that looks at objects and tells you their names in other languages.

What it does
Thing Translator is a browser-based demo that uses your phone or laptop camera to identify objects, then speaks their names in a language you choose. It was built as one of Google’s AI Experiments. The live demo is still up if you want to point it at your coffee mug and hear what it’s called in Japanese.
The interesting bit
The project is essentially a tidy frontend wrapper around two Google Cloud APIs: Vision for object recognition and Translate for, well, translation. The cleverness is in the packaging — it turns API plumbing into a tangible, point-and-shoot experience that feels more like a toy than enterprise software.
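To make that plumbing concrete, here is a minimal sketch of how the two calls could be chained. It is not taken from the project's source: the helper names (buildVisionRequest, topLabel, translateThing and so on) are hypothetical, and it assumes the public REST endpoints for Cloud Vision (images:annotate with LABEL_DETECTION) and Cloud Translation v2, authenticated with a plain API key.

```javascript
// Illustrative only: helper names are hypothetical, not from Thing Translator.
// The endpoints and JSON shapes are those of the public Cloud Vision and
// Cloud Translation (v2) REST APIs.
const VISION_URL = 'https://vision.googleapis.com/v1/images:annotate';
const TRANSLATE_URL = 'https://translation.googleapis.com/language/translate/v2';

// Build a Vision request body for one base64-encoded image (no "data:" prefix).
function buildVisionRequest(base64Image, maxResults = 1) {
  return {
    requests: [{
      image: { content: base64Image },
      features: [{ type: 'LABEL_DETECTION', maxResults }],
    }],
  };
}

// Return the description of the first label in a Vision response, or null.
function topLabel(visionResponse) {
  const labels = visionResponse?.responses?.[0]?.labelAnnotations ?? [];
  return labels.length > 0 ? labels[0].description : null;
}

// Build a Translate v2 request body; Vision labels are assumed to be English.
function buildTranslateRequest(text, target, source = 'en') {
  return { q: text, source, target, format: 'text' };
}

// Return the first translated string from a Translate v2 response, or null.
function translatedText(translateResponse) {
  return translateResponse?.data?.translations?.[0]?.translatedText ?? null;
}

// POST a JSON body to an endpoint, passing the API key as a query parameter.
async function postJson(fetchFn, url, apiKey, body) {
  const res = await fetchFn(`${url}?key=${encodeURIComponent(apiKey)}`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(body),
  });
  if (!res.ok) throw new Error(`Request failed with status ${res.status}`);
  return res.json();
}

// Label the image, then translate the label. fetchFn is injectable so the
// flow can be exercised without network access or billing.
async function translateThing(base64Image, target, apiKey, fetchFn = globalThis.fetch) {
  const vision = await postJson(fetchFn, VISION_URL, apiKey, buildVisionRequest(base64Image));
  const label = topLabel(vision);
  if (label === null) return null; // nothing recognised in the frame
  const translation = await postJson(
    fetchFn, TRANSLATE_URL, apiKey, buildTranslateRequest(label, target));
  return { label, translation: translatedText(translation) };
}
```

In a browser, a call such as translateThing(frameAsBase64, 'ja', apiKey) would resolve to the detected label and its translation, or null when Vision returns no labels. Shipping an API key to the client is acceptable for a demo; a production app would more likely proxy both calls through a server.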
Key highlights
- Live demo runs at thing-translator.appspot.com
- Built with JavaScript; dev server runs on port 9966 via npm start
- Requires Google Cloud API keys for Vision and Translate
- Production builds optimized with npm run build
- Part of Google’s AI Experiments showcase
Caveats
- The README is sparse on architecture details; it’s unclear how audio synthesis is handled
- You’ll need to bring your own GCP billing and API keys to run it locally
- The project appears to be a demo/experiment rather than maintained production code
Verdict
Worth a look if you’re building camera-based ML interfaces and want a minimal reference for wiring Vision + Translate together. Skip it if you need a maintained, batteries-included translation tool — this is glue code with a polished face.