person
Lukas Petersson
person · active · provisional
lukas-petersson-8002e310 · 1 event · first seen 12d ago
Aliases: Lukas Petersson
Recent events (1)
Andon Labs on building frontier evals: VendingBench and evaluating Claude models
Latent Space interviews Lukas Petersson and Axel Backlund of Andon Labs, the creators of VendingBench, about their approach to building real-world AI evaluations. The conversation covers their experience evaluating Claude models across the capability spectrum from Haiku to Mythos, and their methodology for constructing durable frontier evals. The episode is notable for touching on a speculative or unreleased Claude model tier called 'Mythos.'