
ZhuiyiTechnology/roformer

A pre-trained MLM language model implementing rotary position embedding (RoPE) as an alternative relative position encoding for Transformers.

1.1k stars · Python · Language Models
Velocity (7d): +0.6 ★ / day · Trend: steady

Rotary Transformer (RoFormer) implements RoPE, a relative position encoding method in which the context embeddings (query and key) are each multiplied by a rotation matrix that depends on the token's absolute position. The key property is that the inner product of a rotated query and a rotated key depends only on their relative position, which makes the method compatible with linear attention. The implementation changes only the self-attention layer, with minimal code changes, and is built on bert4keras.
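The relative-position property can be checked numerically with a small sketch. This is not the repository's bert4keras code: the helper names `rope` and `dot`, the pairing of adjacent dimensions, and the frequency base of 10000 are assumptions chosen for illustration, and only the Python standard library is used.

```python
import math

def rope(vec, position, base=10000.0):
    """Rotate each adjacent pair of `vec` by an angle proportional to `position`.

    Hypothetical helper: the pairing of dimensions (0,1), (2,3), ... and the
    default `base` are illustrative assumptions, not taken from the repository.
    """
    d = len(vec)
    assert d % 2 == 0, "embedding size must be even"
    out = []
    for i in range(0, d, 2):
        # Each pair gets its own frequency; the angle grows linearly with position.
        theta = position * base ** (-i / d)
        c, s = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        # Standard 2-D rotation of the pair (x, y) by theta.
        out.extend([x * c - y * s, x * s + y * c])
    return out

def dot(a, b):
    """Plain inner product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))

q = [0.3, -1.2, 0.8, 0.5]   # toy query vector
k = [1.0, 0.4, -0.7, 0.9]   # toy key vector

# Same relative offset (3) at two different absolute positions.
score_a = dot(rope(q, 5), rope(k, 2))
score_b = dot(rope(q, 105), rope(k, 102))
print(math.isclose(score_a, score_b, abs_tol=1e-9))
```

Because each pair is only rotated, the vector norms are unchanged and the attention score varies with the offset between the two positions, not with where the pair sits in the sequence.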
