Almanac
Andon Labs

company·active·provisional·andon-labs-e3e7dea4·2 events·first seen 15d ago

Aliases: Andon Labs

Recent events (2)

5·Latent Space·12d ago

Andon Labs on building frontier evals: VendingBench and evaluating Claude models

Latent Space interviews Lukas Petersson and Axel Backlund of Andon Labs, the creators of VendingBench, about their approach to building real-world AI evaluations. The conversation covers their experience evaluating Claude models across the capability spectrum from Haiku to Mythos, and their methodology for constructing durable frontier evals. The episode is notable for touching on a speculative or unreleased Claude model tier called 'Mythos.'

5·The Batch·15d ago

Insurance Companies Carve Out AI Risk Exceptions; GPT-Rosalind, Claude Design, and Agentic Retail Deployments Highlighted

Major insurers including Berkshire Hathaway units, Travelers Group, and Chubb are excluding or restricting AI-related liability coverage, signaling growing concern over hard-to-model AI-driven claims. OpenAI introduced GPT-Rosalind, a domain-specific LLM fine-tuned for life sciences workflows, while Anthropic launched Claude Design for visual asset generation targeting non-designers. Additional items cover an AI-run San Francisco retail store exposing agentic system limitations, Wall Street banks cutting junior roles via AI deployment, and Anthropic's continued engagement with the Trump administration despite prior Pentagon restrictions.