IBM/pytorch-seq2seq

IBM's seq2seq kit: alpha-grade scaffolding for PyTorch

A modular training-and-inference framework for sequence-to-sequence models, built before Transformers ate the world.

1.5k stars · Python · ML Frameworks · Language Models
Velocity (7d): +0.5 ★/day · Trend: steady

What it does

pytorch-seq2seq is a Python framework that packages encoder-decoder training, inference, checkpointing, and vocabulary handling into reusable, swappable components. It targets the classic RNN-based seq2seq era: the README lists CNN and Transformer architectures as planned additions, but neither is implemented yet.

The interesting bit

The project shipped with a literal “reverse a list of numbers” toy example as its flagship demo. That is either charmingly honest or a warning sign, depending on your patience level. The checkpointing scheme is pleasantly opinionated: experiments are organized by timestamp with separate encoder, decoder, and model files.
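
For a sense of how small the "reverse a list of numbers" task is, here is a minimal sketch of generating such data. It is not the project's own script: the function names, the space-separated digit tokens, and the tab-separated source/target layout are all assumptions made for illustration.

```python
import random


def make_reverse_pairs(n_examples, max_len=10, vocab_size=10, seed=0):
    """Build (source, target) string pairs where target is source reversed.

    Hypothetical helper for illustration; not part of pytorch-seq2seq.
    """
    rng = random.Random(seed)  # seeded so the toy data is reproducible
    pairs = []
    for _ in range(n_examples):
        length = rng.randint(1, max_len)  # inclusive on both ends
        src = [str(rng.randrange(vocab_size)) for _ in range(length)]
        tgt = list(reversed(src))
        pairs.append((" ".join(src), " ".join(tgt)))
    return pairs


def to_tsv(pairs):
    """Serialize pairs as one 'source<TAB>target' line each (assumed layout)."""
    return "\n".join(f"{src}\t{tgt}" for src, tgt in pairs)
```

Calling `to_tsv(make_reverse_pairs(1000))` would give a small training file; a model that cannot learn this mapping has a wiring problem, which is arguably the point of such a demo.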

Key highlights

  • Modular encoder/decoder design intended for easy swapping
  • Checkpoint resume and experiment directory structure built in
  • Pre-trained word embedding support added in v0.1.6
  • PyTorch 0.4 compatibility, announced in the README’s “What’s New” banner
  • Vagrant-based dev environment and TravisCI + Codacy integration for testing and style enforcement
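
The timestamp-organized checkpoint layout mentioned above can be sketched with the standard library alone. This is an illustrative approximation, not the framework's checkpoint API: the `checkpoints` subdirectory name, the timestamp format, and both function names are assumptions.

```python
import time
from pathlib import Path

# Assumed timestamp format; zero-padded fields make names sort chronologically.
STAMP_FORMAT = "%Y_%m_%d_%H_%M_%S"


def new_checkpoint_dir(experiment_dir, when=None):
    """Create and return <experiment_dir>/checkpoints/<timestamp>/.

    Hypothetical helper; `when` is an optional time.struct_time.
    """
    stamp = time.strftime(STAMP_FORMAT, when or time.localtime())
    path = Path(experiment_dir) / "checkpoints" / stamp
    path.mkdir(parents=True, exist_ok=True)
    return path


def latest_checkpoint_dir(experiment_dir):
    """Return the newest timestamped directory, or None if there is none."""
    root = Path(experiment_dir) / "checkpoints"
    if not root.is_dir():
        return None
    stamped = sorted((p for p in root.iterdir() if p.is_dir()),
                     key=lambda p: p.name)
    return stamped[-1] if stamped else None
```

Resuming then amounts to loading whatever the training code saved inside the directory returned by `latest_checkpoint_dir`; the encoder, decoder, and model files would live there.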

Caveats

  • Explicitly labeled an “alpha release” by the authors
  • Benchmarks section lists only “WMT Machine Translation (Coming soon)” — no actual benchmark results are shown
  • Transformer and CNN architectures are roadmap items, not implemented features
  • Requires installation from source; no PyPI package mentioned

Verdict

Worth a look if you are maintaining legacy seq2seq code or teaching the fundamentals of encoder-decoder architectures. Skip it if you need production-ready Transformers or modern LLM tooling — this is a 2017-vintage codebase that time has largely passed by.
