product

slime

product · active · provisional · slime-1e4d2b4d · 1 event · first seen 11h ago

Aliases: slime

Co-occurring entities

THUDM

More like this (12)

Slake Unsloth Slack Nonslop SLS mksglu LIME Sandcastle Golem Spud chub smolagents

Recent events (1)

5 · GitHub Trending · 11h ago

THUDM releases slime: RL scaling post-training framework for LLMs

THUDM (Tsinghua University's Knowledge Engineering Group) has released slime, an open-source Python framework for scaling reinforcement learning (RL) in LLM post-training. The repository has 6,548 stars, 195 of them added in a single day, which points to strong community interest. RL-based post-training frameworks are an area of active development following the success of algorithms such as PPO and GRPO in improving reasoning capabilities.

Agent and Tool Ecosystem · Alignment and RLHF · THUDM · slime