jind11/TextFooler

BERT's kryptonite: swapping words until it breaks

A 2019 adversarial attack that fools text classifiers by replacing words with semantically similar synonyms—no model retraining required.

What it does

TextFooler generates adversarial examples for text classification and natural language inference (NLI) models. It selects words in an input sentence, looks up synonyms in counter-fitted word embeddings, and swaps them one at a time until the target model (BERT, LSTM, or CNN) changes its prediction, while keeping the meaning intact for human readers.
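
As a rough illustration of that swap-until-it-flips loop, here is a minimal sketch in Python. It is not the repository's code: the synonym table, the toy classifier, and every function name are hypothetical stand-ins, and it visits words left to right rather than in whatever order the paper prescribes.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical stand-in for nearest neighbours in counter-fitted embeddings.
TOY_SYNONYMS: Dict[str, List[str]] = {
    "love": ["like", "enjoy"],
    "great": ["fine", "decent"],
}

STRONG_WORDS = {"love", "great"}


def toy_predict(tokens: List[str]) -> Tuple[str, float]:
    """Toy sentiment model (an assumption, not BERT): returns (label, confidence)."""
    count = sum(1 for t in tokens if t in STRONG_WORDS)
    pos_prob = (2 * count) / (2 * count + 1)
    if pos_prob > 0.5:
        return "pos", pos_prob
    return "neg", 1.0 - pos_prob


def greedy_swap_attack(
    tokens: List[str],
    predict: Callable[[List[str]], Tuple[str, float]],
    synonyms: Dict[str, List[str]],
) -> Optional[List[str]]:
    """Swap words for synonyms until the predicted label changes.

    Returns the adversarial token list, or None if no flip was found.
    """
    original_label, confidence = predict(tokens)
    current = list(tokens)
    for i in range(len(current)):
        best_trial, best_conf = None, confidence
        for candidate in synonyms.get(current[i], []):
            trial = current[:i] + [candidate] + current[i + 1:]
            label, conf = predict(trial)
            if label != original_label:
                return trial  # prediction flipped: attack succeeded
            if conf < best_conf:
                best_trial, best_conf = trial, conf
        if best_trial is not None:
            # No flip yet: keep the swap that hurt the model's confidence most.
            current, confidence = best_trial, best_conf
    return None


adversarial = greedy_swap_attack("i love this great film".split(), toy_predict, TOY_SYNONYMS)
print(adversarial)
```

The sketch only ever calls `predict`, which is why this style of attack needs nothing beyond query access to the target model.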

The interesting bit

The attack is black-box: it needs no access to model gradients or architecture, only the ability to query the model for its predictions. It uses the Universal Sentence Encoder to filter synonym candidates, enforcing semantic similarity without a human in the loop. The paper’s title asks “Is BERT Really Robust?” Spoiler: the answer was no.
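
The filtering step can be sketched as follows. This is an illustrative sketch only: `toy_encode` is a hypothetical bag-of-words stand-in for the Universal Sentence Encoder (which it does not call), and the 0.7 threshold is an arbitrary assumption, not a value taken from the repository.

```python
import math
from collections import Counter
from typing import Callable, List


def toy_encode(sentence: str) -> Counter:
    """Hypothetical encoder: word counts instead of a real sentence embedding."""
    return Counter(sentence.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a)  # Counter yields 0 for missing keys
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def filter_by_similarity(
    original: str,
    candidates: List[str],
    encode: Callable[[str], Counter] = toy_encode,
    threshold: float = 0.7,  # assumed value, for illustration only
) -> List[str]:
    """Keep only candidate sentences that stay close to the original."""
    reference = encode(original)
    return [c for c in candidates if cosine(reference, encode(c)) >= threshold]


kept = filter_by_similarity(
    "the film was great",
    ["the film was fine", "a dull movie overall"],
)
print(kept)
```

A real sentence encoder would score paraphrases by meaning rather than word overlap; the stand-in only shows where the threshold check sits in the pipeline.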

Key highlights

  • Works against BERT, LSTM, and CNN classifiers on 7 datasets
  • Pre-computed cosine similarity matrices speed up synonym lookup
  • Includes pre-trained target model parameters and generated adversarial examples for direct benchmarking
  • Supports both text classification and NLI tasks with separate scripts (attack_classification.py, attack_nli.py)
  • Published code for an AAAI 2020 paper (preprint posted in 2019), with ~530 stars
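
To show why a pre-computed cosine similarity matrix speeds up synonym lookup, here is a small sketch under stated assumptions: the two-dimensional vectors are invented toy values, not real counter-fitted embeddings, and the function names and the `min_sim` cutoff are hypothetical.

```python
import math
from typing import Dict, List, Tuple

# Invented toy embeddings; real counter-fitted vectors are much larger.
TOY_VECTORS: Dict[str, List[float]] = {
    "good": [1.0, 0.1],
    "great": [0.9, 0.2],
    "bad": [-1.0, 0.1],
    "film": [0.0, 1.0],
}


def build_similarity_matrix(
    vectors: Dict[str, List[float]],
) -> Tuple[List[str], List[List[float]]]:
    """Normalise every vector once, then store all pairwise cosine similarities."""
    words = sorted(vectors)
    unit = []
    for word in words:
        vec = vectors[word]
        norm = math.sqrt(sum(x * x for x in vec))
        unit.append([x / norm for x in vec])
    matrix = [[sum(a * b for a, b in zip(u, v)) for v in unit] for u in unit]
    return words, matrix


def nearest_synonyms(
    word: str,
    words: List[str],
    matrix: List[List[float]],
    k: int = 1,
    min_sim: float = 0.5,  # assumed cutoff, for illustration only
) -> List[str]:
    """Read one matrix row and return the top-k neighbours above the cutoff."""
    i = words.index(word)
    ranked = sorted(
        ((matrix[i][j], words[j]) for j in range(len(words)) if j != i),
        reverse=True,
    )
    return [w for sim, w in ranked[:k] if sim >= min_sim]


words, matrix = build_similarity_matrix(TOY_VECTORS)
print(nearest_synonyms("good", words, matrix))
```

The matrix costs quadratic time and memory to build, but each later lookup is a single row scan, which pays off when the attack queries synonyms for many words.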

Caveats

  • Setup requires installing a separate esim package manually and downloading ~1GB of counter-fitted embeddings
  • README is sparse on how the attack actually selects which words to perturb; you’ll need the paper for algorithmic details
  • Google Drive links for datasets and models may rot over time

Verdict

Worth a look if you’re building NLP defenses or benchmarking model robustness—this was an influential early attack. Skip it if you need something production-ready; the tooling is research-grade and the field has moved on to more sophisticated attacks.
