Is bilm-tf open source?

Yes — allenai/bilm-tf is open source, released under the Apache-2.0 license.

What language is bilm-tf written in?

allenai/bilm-tf is primarily written in Python.

How popular is bilm-tf?

allenai/bilm-tf has 1.6k stars on GitHub.

Where can I find bilm-tf?

allenai/bilm-tf is on GitHub at https://github.com/allenai/bilm-tf.

allenai/bilm-tf

The original ELMo: when "contextualized" was still novel

TensorFlow 1.2 implementation of the bidirectional language model that made word embeddings depend on their neighbors.

★ 1.6k stars · Python · Language Models · ML Frameworks

What it does

This is AllenAI’s reference TensorFlow implementation of ELMo (deep contextualized word representations). It trains bidirectional language models and exposes three inference modes: raw character-level encoding (slowest, most general), cached token embeddings with biLSTM context (middle ground), and pre-computing entire datasets to HDF5 (fastest, most rigid).

The interesting bit

The three-speed design is the practical hook. The README is refreshingly honest about trade-offs: character-level for unseen test data, cached tokens for fixed vocabularies like SNLI, full pre-computation when you want to escape TensorFlow entirely. It’s a snapshot of 2018 NLP engineering before transformers swallowed everything.
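
The trade-off described above can be written down as a small decision helper. This is only a sketch of the reasoning in this review, not code from bilm-tf: the enum, the function name, and both parameters are hypothetical.

```python
from enum import Enum


class ElmoMode(Enum):
    """The three usage modes described in this review (names are hypothetical)."""
    CHARACTER = "character inputs, computed on the fly"
    CACHED_TOKENS = "cached token embeddings plus biLSTM context"
    PRECOMPUTED = "whole dataset precomputed to HDF5"


def choose_mode(unseen_text_expected: bool,
                keep_tensorflow_at_runtime: bool) -> ElmoMode:
    """Pick a mode from the two constraints the review highlights."""
    # Unseen test data needs the most general (and slowest) path.
    if unseen_text_expected:
        return ElmoMode.CHARACTER
    # A fixed vocabulary (the SNLI case) allows cached token embeddings.
    if keep_tensorflow_at_runtime:
        return ElmoMode.CACHED_TOKENS
    # Otherwise precompute everything once and drop TensorFlow afterwards.
    return ElmoMode.PRECOMPUTED
```

For example, a fixed-vocabulary task that still runs the biLSTM in TensorFlow maps to `choose_mode(False, True)`, which returns `ElmoMode.CACHED_TOKENS`.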

Key highlights

Ships with pre-trained English models and the exact 1 Billion Word Benchmark setup used in the original paper
Character-based inputs mean out-of-vocabulary (OOV) words still get representations, though with “a slight decrease in run time” (read: it runs slightly slower)
Training script bin/train_elmo.py documents the original hyperparameters: 3 GTX 1080s, 10 epochs, ~two weeks
Includes checkpoint-to-HDF5 conversion for interoperability with AllenNLP’s PyTorch implementation
Docker image available, but it requires nvidia-docker, so there is no CPU fallback
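
To see why character-based inputs sidestep the OOV problem, here is a simplified sketch of encoding any word as a fixed-length list of byte ids. It is an illustration only: the marker ids, the default length, and the function name are assumptions made for this example, and bilm-tf's actual encoding may differ in its constants and padding.

```python
from typing import List

# Illustrative marker ids placed just above the byte range (0-255).
# These values are assumptions for this sketch, not bilm-tf's constants.
BEGIN_WORD = 256
END_WORD = 257
PADDING = 258


def word_to_char_ids(word: str, max_length: int = 12) -> List[int]:
    """Encode a word as a fixed-length list of UTF-8 byte ids plus markers."""
    # Leave room for the begin and end markers, truncating long words.
    payload = list(word.encode("utf-8"))[: max_length - 2]
    ids = [BEGIN_WORD] + payload + [END_WORD]
    # Pad on the right so every word has the same length.
    return ids + [PADDING] * (max_length - len(ids))
```

Because every string reduces to bytes, a word never seen in training (say, `word_to_char_ids("zyzzyva")`) still yields a valid input; no lookup table has to contain it.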

Caveats

Locked to TensorFlow 1.2, which is now archaeological; the README itself nudges users toward TensorFlow Hub or AllenNLP for new work
Vocabulary file must be sorted by descending token frequency, with <S>, </S>, and <UNK> placed exactly as the README specifies; the convention is brittle and easy to botch
Fine-tuning on small corpora (<10M tokens) risks overfitting; the authors explicitly warn against training too long
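
The vocabulary convention is easier to get right when the file is generated rather than hand-edited. The sketch below assumes the three special tokens occupy the first three lines in the order shown, followed by tokens in descending count; check that placement against the README before relying on it. The function names and the alphabetical tie-break are hypothetical choices for this example.

```python
from collections import Counter
from typing import Iterable, List

# Assumed placement: special tokens first, in this order.
SPECIAL_TOKENS = ["<S>", "</S>", "<UNK>"]


def build_vocab_lines(sentences: Iterable[List[str]]) -> List[str]:
    """Return vocabulary lines: special tokens, then tokens by descending count."""
    counts = Counter(token for sentence in sentences for token in sentence)
    # Drop special tokens found in the data so they are not listed twice.
    for special in SPECIAL_TOKENS:
        counts.pop(special, None)
    # Sort by descending count; break ties alphabetically for determinism.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return SPECIAL_TOKENS + [token for token, _ in ranked]


def write_vocab(path: str, lines: List[str]) -> None:
    """Write one token per line."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("\n".join(lines) + "\n")
```

With the input `[["the", "cat", "the"], ["a", "cat", "the"]]`, `build_vocab_lines` returns the three special tokens followed by `the`, `cat`, `a`.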

Verdict

Worth studying if you’re tracing the evolution of contextual embeddings or reproducing 2018 baselines. For production use, follow the README’s own advice and use the TensorFlow Hub or AllenNLP versions instead.
