LeeSureman/Flat-Lattice-Transformer

A transformer that flattens Chinese word lattices without the combinatorial mess

Research code for FLAT, an ACL 2020 paper that rethinks how to feed Chinese word-segmentation ambiguities into a transformer: as one flat sequence of characters and candidate words rather than a lattice graph.

1k stars · Python · Language Models · ML Frameworks

What it does

FLAT tackles Chinese named entity recognition (NER) by encoding a “flat lattice” as a single sequence: the sentence’s characters plus every lexicon word that matches a span of them, each tagged with its head and tail position. Position-aware attention over that sequence replaces an explicit lattice graph. The repo contains two base variants, V0 without BERT and V1 with BERT, plus later memory-optimized versions (V2, which deduplicates with tensor.unique(), and V3, which uses scalar position encoding). To run it, download the Gigaword character and bigram embeddings and one of two word embedding sets, then point it at the OntoNotes, MSRA, Weibo, or Resume dataset.
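
To make the “flat lattice” idea concrete, here is a minimal sketch. It is not the repo’s code. The helper name build_flat_lattice, the max_word_len cutoff, and the toy lexicon are all assumptions made for illustration. Each token carries a head and a tail character index, and a single character has head equal to tail.

```python
from typing import List, Set, Tuple

Span = Tuple[str, int, int]  # (token, head index, tail index), both inclusive


def build_flat_lattice(sentence: str, lexicon: Set[str], max_word_len: int = 4) -> List[Span]:
    """Return the characters followed by every lexicon word matched in the sentence.

    Hypothetical helper for illustration; the repo's own matching logic may differ.
    """
    # Each character is a span whose head and tail coincide.
    spans: List[Span] = [(ch, i, i) for i, ch in enumerate(sentence)]
    # Append every multi-character substring (up to max_word_len) found in the lexicon.
    for start in range(len(sentence)):
        for end in range(start + 1, min(start + max_word_len, len(sentence))):
            word = sentence[start:end + 1]
            if word in lexicon:
                spans.append((word, start, end))
    return spans


lexicon = {"重庆", "人和药店", "药店"}  # toy lexicon, for illustration only
lattice = build_flat_lattice("重庆人和药店", lexicon)
for token, head, tail in lattice:
    print(token, head, tail)
```

The six characters and three matched words become one nine-token sequence, so no graph structure is needed: the word boundaries live entirely in the head and tail indices.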

The interesting bit

The cleverness is in the framing: instead of wrestling with graph neural networks over word lattices, FLAT treats lattice nodes as a flat sequence and uses relative position encoding, computed from each span’s head and tail indices, to preserve span boundary information. The 2022 update then cuts memory sharply: at 300-token sequences, Flat_scalar uses 1.3GB where the original used 8.4GB, by replacing the full relative position matrix with scalars.
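
The relative position idea can be sketched in a few lines. The FLAT paper describes four pairwise distances between spans i and j: head to head, head to tail, tail to head, and tail to tail. The function name relative_distances and the plain nested-list layout below are assumptions for illustration, not the repo’s implementation.

```python
from typing import Dict, List, Tuple

Span = Tuple[str, int, int]  # (token, head index, tail index)


def relative_distances(spans: List[Span]) -> Dict[str, List[List[int]]]:
    """Compute the four pairwise distance matrices between span boundaries.

    Entry [i][j] of "ht", for example, is head[i] - tail[j].
    """
    heads = [head for _, head, _ in spans]
    tails = [tail for _, _, tail in spans]
    return {
        "hh": [[hi - hj for hj in heads] for hi in heads],
        "ht": [[hi - tj for tj in tails] for hi in heads],
        "th": [[ti - hj for hj in heads] for ti in tails],
        "tt": [[ti - tj for tj in tails] for ti in tails],
    }


spans = [("重", 0, 0), ("庆", 1, 1), ("重庆", 0, 1)]
dist = relative_distances(spans)
# The word "重庆" (index 2) starts where "重" starts and ends where "庆" ends.
print(dist["hh"][2][0], dist["tt"][2][1])
```

Together the four values tell the model whether two spans overlap, nest, or sit apart. In the paper each distance is mapped to an embedding vector, so the pairwise position tensor grows with the square of the sequence length, which is presumably the cost the memory-optimized variants target.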

Key highlights

  • ACL 2020 paper implementation with reproducible scripts for four standard Chinese NER datasets
  • Two base variants (V0 without BERT, V1 with BERT) plus later memory-optimized variants
  • Explicit memory benchmarks showing a roughly 6× reduction (8.4GB to 1.3GB at 300 tokens) for scalar encoding vs. the original
  • Built on FastNLP 0.5.0 (somewhat dated stack: Python 3.7, PyTorch 1.2)
  • fitlog integration for experiment tracking, though it’s opt-in
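
The deduplication idea behind the memory-optimized variants can be illustrated without PyTorch. This is a guess at the general technique, not the repo’s code: many span pairs share the same relative distance, so an expensive embedding need only be computed once per distinct value and then looked up per pair. The names embed_unique and toy_embed are hypothetical. In PyTorch, torch.unique with return_inverse=True returns the distinct values and the index map in one call.

```python
from typing import Callable, Dict, List, Tuple


def embed_unique(
    distances: List[List[int]],
    embed: Callable[[int], List[float]],
) -> Tuple[List[List[float]], List[List[int]]]:
    """Apply `embed` once per distinct distance and return (table, inverse).

    table[k] is the embedding of the k-th distinct value;
    inverse[i][j] is the row of `table` that pair (i, j) should use.
    """
    unique_values = sorted({d for row in distances for d in row})
    index_of: Dict[int, int] = {d: k for k, d in enumerate(unique_values)}
    table = [embed(d) for d in unique_values]  # one call per distinct value
    inverse = [[index_of[d] for d in row] for row in distances]
    return table, inverse


calls: List[int] = []  # records every distance the toy embedding is asked for


def toy_embed(d: int) -> List[float]:
    """Stand-in for a costly position embedding (illustration only)."""
    calls.append(d)
    return [float(d), float(d * d)]


# 5 positions give 25 pairs but only 9 distinct distances (-4 .. 4).
distances = [[i - j for j in range(5)] for i in range(5)]
table, inverse = embed_unique(distances, toy_embed)
print(len(table), len(calls))
```

The saving grows with sequence length: the number of pairs is quadratic, while the number of distinct distances is only linear.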

Caveats

  • Dependency versions are frozen in 2019; expect friction with modern PyTorch/CUDA
  • Pretrained embeddings require manual download from Google Drive or Baidu Pan, then path configuration in paths.py
  • README is bilingual but thin on architecture details; you’ll need the paper to understand the model

Verdict

Worth a look if you’re doing Chinese NER research or need a baseline that handles word segmentation ambiguity without GNN complexity. Skip it if you want a maintained, pip-installable library; this is paper reproduction code with a dusty dependency stack.
