Entity · model

RWKV

model · active · rwkv-10c3f159 · 2 events · first seen May 19, 2026

Aliases: RWKV

Co-occurring entities

Diffusion Language Models · B³D-RWKV · triplet-block layout · bidirectional attention · Transformers · Recurrent Neural Network · Hugging Face

More like this (12)

RWKU · RLVR · B³D-RWKV · KV Cache · RL² · CARV · VRRL · Ecom-RLVE · KVPress · veRL · SnapKV · VerbatimRAG

Recent events (2)

6 · arXiv · cs.CL · May 26, 2026

Triplet-Block Diffusion RWKV: Unifying Linear-Time Causal Models with Bidirectional Discrete Diffusion

The paper introduces B³D-RWKV, a 7.2B-parameter language model that combines RWKV's O(L) linear-time inference with parallel bidirectional discrete diffusion via a triplet-block layout. According to the paper, this layout resolves the tension between the causal (unidirectional) processing of RWKV and the bidirectional attention that diffusion requires. On an 8-task evaluation suite, B³D-RWKV-7.2B achieves accuracy comparable to existing models while delivering an average 1.6× speedup in decoding throughput over baselines.

Frontier Model Releases · Inference Economics · Diffusion Language Models · B³D-RWKV · RWKV · +2 more

5 · Hugging Face Blog · May 19, 2026

Introducing RWKV - An RNN with the advantages of a transformer

Hugging Face introduces RWKV, a recurrent neural network architecture designed to combine the parallelizable training of transformers with the efficient linear-time inference of RNNs. According to the post, the model avoids the quadratic attention bottleneck of standard transformers while maintaining competitive performance. RWKV is an alternative to the dominant transformer architecture for language modeling.

Frontier Model Releases · Open Weights Progress · Transformers · Recurrent Neural Network · Hugging Face · +2 more