EleutherAI/gpt-neo

An implementation of GPT-2 and GPT-3-style transformer language models using mesh-tensorflow with support for training and inference on TPU and GPU.

8.3k stars · Python · Language Models
Velocity (7d): +3.8 ★/day · Trend: steady

Implements model-parallel and data-parallel GPT-3-style language models using the mesh-tensorflow library. Supports training and inference on TPU and GPU, with extensions beyond standard GPT-3 including local attention, linear attention, mixture of experts, and axial positional embeddings. Provides pretrained models (1.3B and 2.7B parameters) trained on The Pile dataset. The project is archived in favor of the GPU-focused GPT-NeoX repository.
