BERT learns to read spreadsheets
Google Research's TAPAS lets you ask natural-language questions of structured tables without generating SQL or logical forms.

What it does TAPAS is a transformer-based model that takes a table and a natural-language question, then predicts which cells to select or aggregate to produce an answer. It treats table QA as an end-to-end classification problem over cell coordinates and aggregation operators, skipping the traditional intermediate step of generating a query language like SQL.
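To make the "classification over cells plus an aggregation operator" framing concrete, here is a minimal sketch of how an answer could be assembled from those two kinds of predictions. The function name `decode_answer`, the 0.5 threshold, and the operator list are illustrative assumptions, not the TAPAS code itself.

```python
from typing import Dict, List, Tuple, Union

# Hypothetical operator set for illustration; index 0 means "no aggregation".
AGG_OPS = ["NONE", "SUM", "AVERAGE", "COUNT"]

Coord = Tuple[int, int]  # (row, column), both 0-based in this sketch


def decode_answer(
    table: List[List[str]],
    cell_probs: Dict[Coord, float],
    agg_scores: List[float],
    threshold: float = 0.5,
) -> Union[str, float]:
    """Turn per-cell probabilities and operator scores into an answer."""
    # Keep the cells whose selection probability clears the assumed threshold.
    selected = sorted(c for c, prob in cell_probs.items() if prob > threshold)
    values = [table[row][col] for row, col in selected]

    # Pick the highest-scoring aggregation operator (a plain argmax).
    op = AGG_OPS[max(range(len(agg_scores)), key=lambda i: agg_scores[i])]

    if op == "NONE":
        return ", ".join(values)  # answer is the selected cell text
    if op == "COUNT":
        return float(len(values))
    numbers = [float(v) for v in values]
    if op == "SUM":
        return sum(numbers)
    # AVERAGE; NaN when nothing was selected
    return sum(numbers) / len(numbers) if numbers else float("nan")


# Made-up table and scores, as if produced by a model.
table = [["Tokyo", "37"], ["Delhi", "31"], ["Shanghai", "27"]]
cell_probs = {(0, 1): 0.9, (1, 1): 0.8, (2, 1): 0.1, (0, 0): 0.05}
agg_scores = [0.1, 2.3, 0.4, 0.2]  # SUM scores highest

print(decode_answer(table, cell_probs, agg_scores))  # 68.0
```

No query string is ever produced in this flow: the answer comes straight from a set of cell coordinates and one operator label.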
The interesting bit The model encodes the table directly into the transformer by adding positional embeddings that track row and column indices, plus rank embeddings that encode the order of numeric values within a column. This lets BERT-like attention operate over flattened table tokens as if they were sentences, which is either elegant or horrifying depending on your feelings about spreadsheets.
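The flattening idea can be sketched in a few lines. This is a simplified illustration under stated assumptions: `flatten_table` and `TableToken` are hypothetical names, whitespace splitting stands in for the model's real subword tokenizer, and only the row and column indices are shown (the numeric-value signal is omitted).

```python
from typing import List, NamedTuple


class TableToken(NamedTuple):
    text: str
    segment_id: int  # 0 = question, 1 = table
    row_id: int      # 0 = question or header row, 1.. = data rows
    column_id: int   # 0 = question, 1.. = table columns


def flatten_table(
    question: str, header: List[str], rows: List[List[str]]
) -> List[TableToken]:
    """Flatten a question and a table into one token sequence with indices."""
    # Question tokens carry no table position.
    tokens = [TableToken(word, 0, 0, 0) for word in question.split()]

    # Header cells sit in row 0 and get a 1-based column index.
    for col, name in enumerate(header, start=1):
        tokens += [TableToken(word, 1, 0, col) for word in name.split()]

    # Every token in a data cell inherits that cell's row and column index.
    for row, cells in enumerate(rows, start=1):
        for col, cell in enumerate(cells, start=1):
            tokens += [TableToken(word, 1, row, col) for word in cell.split()]
    return tokens


tokens = flatten_table(
    "Which city is largest?",
    ["City", "Population"],
    [["Tokyo", "37"], ["New Delhi", "31"]],
)
for token in tokens:
    print(token)
```

In a real model each of these index fields would select a learned embedding that is added to the token embedding, so attention can tell that "New" and "Delhi" belong to the same cell while "31" sits in the same row but a different column.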
Key highlights
- Pre-trained on 6.2M table-text pairs from Wikipedia, then fine-tuned on WikiSQL, WTQ, SQA, and TabFact
- Released in multiple sizes from TINY to LARGE, with and without per-cell position index resetting
- Also supports table entailment (TabFact) and open-domain table retrieval via dense passage retrieval extensions
- Available in Hugging Face transformers since v4.1.1 with 28 checkpoints and a live widget
- Multiple Colab notebooks provided for trying predictions on GPU/TPU without local setup
Caveats
- Requires the protoc compiler to be installed before pip install, due to protocol buffer dependencies
- Self-reported accuracy metrics are medians over three runs using their own evaluation tool, not official task metrics
- The WTQ dev accuracy tops out around 51% even for the largest model, suggesting tables remain genuinely hard
Verdict Worth exploring if you’re building natural-language interfaces to databases or documents with embedded tables. Skip it if you need guaranteed exact SQL generation or your tables are small enough that a traditional query builder suffices.