Almanac
Lukas Petersson

person · active · provisional · lukas-petersson-8002e310 · 1 event · first seen 12d ago

Aliases: Lukas Petersson

Recent events (1)

5 · Latent Space · 12d ago

Andon Labs on building frontier evals: VendingBench and evaluating Claude models

Latent Space interviews Lukas Petersson and Axel Backlund of Andon Labs, the creators of VendingBench, about their approach to building real-world AI evaluations. The conversation covers their experience evaluating Claude models across the capability spectrum from Haiku to Mythos, and their methodology for constructing durable frontier evals. The episode is notable for touching on a speculative or unreleased Claude model tier called 'Mythos.'