ZhuiyiTechnology/roformer
A pre-trained masked language model (MLM) that uses rotary position embedding (RoPE) as an alternative relative position encoding for Transformers.

Velocity (7d): +0.6 ★/day · Trend: steady
Rotary Transformer (RoFormer) implements RoPE, a relative position encoding method in which the context embeddings (queries and keys) are multiplied by rotation matrices that depend on each token's absolute position. The key property is that the inner product of a rotated query and a rotated key depends only on their relative position. Because the rotation is applied to queries and keys separately, before any attention scores are computed, RoPE is also suitable for linear attention. The implementation modifies the self-attention layer of bert4keras with minimal code changes.
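The relative-position property can be checked numerically. The following is a minimal NumPy sketch, not the repository's bert4keras code: the function name `rope`, the frequency base of 10000, and the choice to rotate adjacent feature pairs are assumptions made for illustration.

```python
import numpy as np

def rope(x, positions, base=10000.0):
    """Rotate each adjacent pair of features by a position-dependent angle.

    x:         float array of shape (seq_len, dim), with dim even.
    positions: array of shape (seq_len,) holding absolute positions.
    NOTE: `rope`, `base`, and the pairing scheme are illustrative assumptions.
    """
    dim = x.shape[-1]
    assert dim % 2 == 0, "feature dimension must be even"
    # One rotation frequency per 2-D feature pair: base^(-2i/dim).
    inv_freq = base ** (-np.arange(0, dim, 2) / dim)   # shape (dim/2,)
    angles = positions[:, None] * inv_freq[None, :]    # shape (seq_len, dim/2)
    cos, sin = np.cos(angles), np.sin(angles)
    x_even, x_odd = x[:, 0::2], x[:, 1::2]
    out = np.empty_like(x)
    # Standard 2-D rotation applied to each (even, odd) feature pair.
    out[:, 0::2] = x_even * cos - x_odd * sin
    out[:, 1::2] = x_even * sin + x_odd * cos
    return out

rng = np.random.default_rng(0)
q = rng.standard_normal((1, 8))  # a single query vector
k = rng.standard_normal((1, 8))  # a single key vector

def score(m, n):
    """Attention score between q placed at position m and k at position n."""
    qm = rope(q, np.array([m]))
    kn = rope(k, np.array([n]))
    return float(qm[0] @ kn[0])

# The same offset (m - n = 3) at different absolute positions gives the same score.
assert np.isclose(score(5, 2), score(103, 100))
```

Since each query and key is rotated on its own before the inner product is taken, a transformation of this kind can be applied ahead of any attention kernel, which is what makes it a candidate for linear attention variants.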