VibeThinker

model · active · provisional · vibethinker-240793a8 · 1 event · first seen 10h ago

Aliases: VibeThinker

Recent events (1)

6 · Hacker News · 10h ago

VibeThinker: 3B parameter model claims to beat Claude Opus 4.5 on reasoning via SFT+GRPO

A preprint on arXiv introduces VibeThinker, a 3-billion-parameter model that reportedly outperforms Claude Opus 4.5 on reasoning benchmarks using a novel combination of supervised fine-tuning (SFT) and Group Relative Policy Optimization (GRPO). If the result is reproducible, it would be a notable efficiency milestone: a small open model matching or exceeding a frontier closed model on reasoning tasks. The Hacker News post has drawn 191 points and 73 comments, a sign of substantial community interest.