Is anystyle open source?

Yes — inukshuk/anystyle is an open-source project tracked on heatdrop.

What language is anystyle written in?

inukshuk/anystyle is primarily written in Ruby.

How popular is anystyle?

inukshuk/anystyle has 1.3k stars on GitHub.

Where can I find anystyle?

inukshuk/anystyle is on GitHub at https://github.com/inukshuk/anystyle.

inukshuk/anystyle

A citation parser that learned to read bibliographies so you don't have to

AnyStyle uses machine learning to turn messy reference strings into structured data, with a focus on letting you train it on your own weird formatting conventions.

★ 1.3k stars · Ruby · Data Tooling

What it does

AnyStyle parses free-form bibliographic references (for example, citation strings copied from PDFs or web pages) into structured fields such as author, title, date, and publisher. It ships as a Ruby gem and a CLI tool, and it powers the web app at anystyle.io. The parser handles a Derrida citation in French as readily as an English journal article, extracting language and script metadata along the way.
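
As a rough illustration of the Ruby API, the sketch below passes a single reference string to AnyStyle.parse, the entry point shown in the project's README. It assumes the anystyle gem is installed. The wrapper name parse_reference and the fallback to nil when the gem is missing are illustrative choices, not part of AnyStyle.

```ruby
# Hypothetical wrapper (not part of AnyStyle): returns the parsed
# reference(s), or nil when the anystyle gem is not installed.
def parse_reference(reference)
  require 'anystyle'
  AnyStyle.parse(reference)
rescue LoadError
  nil
end

reference = 'Derrida, J. (1967). L’écriture et la différence (1 éd.). Paris: Éditions du Seuil.'
parsed = parse_reference(reference)

if parsed.nil?
  puts 'anystyle gem not available; install it with `gem install anystyle`'
else
  # Each parsed reference is expected to be a hash of field names to values.
  parsed.each { |fields| puts fields.inspect }
end
```

The exact fields returned depend on the model in use, so it is worth inspecting the output on a few of your own references before building on it.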

The interesting bit

The project doesn’t pretend one model fits all citation styles. It exposes training pipelines so you can build custom models on your own annotated data and check their quality against held-out “gold” sets. The default model is trained on a manually curated corpus, and the README is upfront about its skew: 965 English references versus 54 French and a grab bag of others. The maintainers all but ask you to retrain if you’re working outside Anglophone science publishing.
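
To make the held-out idea concrete, here is a minimal sketch of splitting annotated references into a training set and a gold set. The helper split_gold, its parameters, and the 10% default are assumptions made for illustration; AnyStyle does not prescribe this split, and its own tooling works on annotated files rather than plain arrays.

```ruby
# Hypothetical helper (not part of AnyStyle): splits annotated records
# into a training set and a held-out "gold" set used only for checking.
def split_gold(records, gold_ratio: 0.1, seed: 42)
  unless (0.0..1.0).cover?(gold_ratio)
    raise ArgumentError, 'gold_ratio must be between 0 and 1'
  end

  # A seeded shuffle keeps the split reproducible across runs.
  shuffled = records.shuffle(random: Random.new(seed))
  gold_size = (records.size * gold_ratio).round

  { gold: shuffled.first(gold_size), train: shuffled.drop(gold_size) }
end

records = (1..20).map { |i| "annotated reference #{i}" }
split = split_gold(records, gold_ratio: 0.25)
puts "train: #{split[:train].size}, gold: #{split[:gold].size}"
```

Keeping the gold set out of training is what makes the later quality check meaningful: a model scored on references it has already seen will look better than it is.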

Key highlights

CLI, Ruby API, and open-source web interface (anystyle.io)
Custom model training with anystyle train and quality checking via sequence/token error rates
Supports Latin scripts broadly, plus Cyrillic; explicitly incompatible with Chinese, Japanese, Arabic, and Indian languages that do not separate tokens with whitespace
Pluggable dictionary backends: in-memory Ruby hash, GDBM, or Redis
BSD-licensed, volunteer-maintained since 2011
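
The sequence and token error rates mentioned in the highlights can be sketched as follows. This uses the usual definitions, assumed here rather than taken from AnyStyle's source: the token error rate is the share of mislabeled tokens, and the sequence error rate is the share of references containing at least one mislabeled token. The function error_rates and the label names are hypothetical.

```ruby
# Hypothetical helper (not part of AnyStyle): compares predicted labels
# against gold labels, one array of labels per reference.
def error_rates(gold, predicted)
  raise ArgumentError, 'sequence counts differ' unless gold.size == predicted.size

  token_total = 0
  token_errors = 0
  sequence_errors = 0

  gold.zip(predicted).each do |gold_seq, pred_seq|
    raise ArgumentError, 'token counts differ' unless gold_seq.size == pred_seq.size

    wrong = gold_seq.zip(pred_seq).count { |expected, actual| expected != actual }
    token_total += gold_seq.size
    token_errors += wrong
    # A sequence counts as wrong if any single token in it is mislabeled.
    sequence_errors += 1 if wrong.positive?
  end

  {
    token: token_total.zero? ? 0.0 : token_errors.to_f / token_total,
    sequence: gold.empty? ? 0.0 : sequence_errors.to_f / gold.size
  }
end

gold      = [%i[author title date], %i[author title]]
predicted = [%i[author title date], %i[author date]]
rates = error_rates(gold, predicted)
puts format('token: %.2f, sequence: %.2f', rates[:token], rates[:sequence])
```

In this toy input one of five tokens is wrong, and it sits in one of two references, so the sequence rate is higher than the token rate. That gap is typical: a single bad label is enough to spoil a whole reference.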

Caveats

The default training data is heavily English-biased; non-English results may need custom models
Finder model training data is partially withheld due to copyright restrictions

Verdict

Worth a look if you’re building bibliographic pipelines, cleaning up reference dumps, or need citation parsing you can retrain for domain-specific formats. Skip it if you’re processing CJK or Arabic script natively, or if you need a Python-native solution: this is Ruby territory.
