organization
THUDM
organization · active · provisional
thudm-2a1ac2ff · 1 event · first seen 11h ago
Aliases: THUDM
Recent events (1)
THUDM releases slime: RL scaling post-training framework for LLMs
THUDM (the Knowledge Engineering Group and Data Mining lab at Tsinghua University) has released slime, an open-source Python framework for scaling reinforcement learning (RL) in LLM post-training. The repository has 6,548 stars, 195 of them added in a single day, which suggests strong community interest. RL-based post-training frameworks are an active area of development following the success of algorithms such as PPO (Proximal Policy Optimization) and GRPO (Group Relative Policy Optimization) in improving LLM reasoning capabilities.