Andrew Ng on Voice UI Architecture and the Vocal Bridge Developer Toolkit
Andrew Ng argues that voice-enabled UIs are underappreciated and will become pervasive, drawing on his experience adding voice to a personal app in under an hour using Claude Code. He describes a dual-agent architecture, in which a low-latency foreground conversational agent is paired with a high-intelligence background agentic workflow, as the key to resolving the latency-versus-reliability tradeoff in voice AI. The piece highlights Vocal Bridge, an AI Fund portfolio company, as a developer tooling provider enabling this pattern. Hackathon examples include a clinical trial matcher and a conversational portfolio advisor built with the toolkit.
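The dual-agent pattern described above can be illustrated with a minimal Python sketch. This is not Vocal Bridge's API or Ng's implementation. The names `background_workflow`, `foreground_agent`, and `run_turn` are hypothetical, and the slow agentic workflow is simulated with a short sleep. The sketch only shows the shape of the idea: the foreground agent replies at once while the slower background work runs concurrently and reports back when it finishes.

```python
import asyncio


async def background_workflow(request: str) -> str:
    """Hypothetical stand-in for a slow, high-intelligence agentic workflow."""
    await asyncio.sleep(0.05)  # simulates multi-step reasoning or tool calls
    return f"Detailed answer for: {request}"


async def foreground_agent(request: str, spoken: list[str]) -> None:
    """Low-latency conversational layer: acknowledge now, report back later."""
    # Start the slow work without waiting for it to finish.
    task = asyncio.create_task(background_workflow(request))
    # Respond immediately so the user is not left in silence.
    spoken.append("Got it, let me look into that.")
    # Deliver the background result once it is ready.
    spoken.append(await task)


def run_turn(request: str) -> list[str]:
    """Run one conversational turn and return the utterances in order."""
    spoken: list[str] = []
    asyncio.run(foreground_agent(request, spoken))
    return spoken


if __name__ == "__main__":
    for line in run_turn("find clinical trials for condition X"):
        print(line)
```

In a real voice system the foreground agent would keep handling further user turns while the background task runs. Here the single turn simply awaits the result to keep the example short.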
Related events (8)
DeepLearning.AI Launches AI Andrew: A Personality-Shaped AI Companion Built on Agentic Harness
Andrew Ng's team at DeepLearning.AI has released 'AI Andrew,' an AI companion designed to emulate Ng's communication style and personality for conversations about AI, careers, and learning. The system uses an agentic harness combining RAG, small and large models, guardrails, short- and long-term memory, and offline agentic loops that automatically propose system improvements. The team employed iterative error analysis to close the gap between AI Andrew's outputs and Ng's actual communication style, though acknowledged remaining issues including hallucinations. The product targets people seeking guidance on AI concepts, career decisions, and project ideas.
How OpenAI Delivers Low-Latency Voice AI at Scale
OpenAI published a technical overview of how it rebuilt its WebRTC stack to support real-time voice AI at global scale. The post covers infrastructure choices enabling low-latency audio delivery and conversational turn-taking. This represents a production-grade engineering disclosure about the systems underpinning OpenAI's voice products.
Navigating the challenges and opportunities of synthetic voices
OpenAI shares lessons from a small-scale preview of Voice Engine, a model capable of generating custom synthetic voices from a short audio sample. The post discusses both the technical capabilities and the safety/policy challenges associated with synthetic voice generation. OpenAI frames this as a cautious, staged rollout with safeguards to prevent misuse such as voice cloning fraud.
Anthropic launches Claude Mythos 5 and Claude Fable 5; Andrew Ng introduces OpenCoworker desktop agent
Anthropic released Claude Mythos 5 and Claude Fable 5, two variants of the same frontier model that set new state-of-the-art results across software engineering, knowledge work, cybersecurity, and agentic coding benchmarks. Claude Fable 5 is the general-availability version with safety classifiers that restrict responses on security, biology, chemistry, and cutting-edge AI topics, priced at $10/$50 per million input/output tokens; Mythos 5 is restricted to selected partners via Project Glasswing. Separately, Andrew Ng and collaborators released OpenCoworker, a free open-source desktop agent harness built on top of aisuite, designed to give users privacy-preserving agentic workflows with their own API keys or local models. The newsletter also contextualizes the broader shift toward LLM-driven agent harnesses as frontier models have become capable enough to reliably drive next-action decisions.
A New Framework for Evaluating Voice Agents (EVA)
ServiceNow AI has published a blog post on Hugging Face introducing EVA, a new evaluation framework designed specifically for voice agents. The framework appears to address gaps in existing evaluation methodologies for assessing voice-based AI agent performance. As voice agents become more prevalent in enterprise and consumer settings, standardized evaluation protocols are increasingly important for benchmarking progress.
Expanding on how Voice Engine works and our safety research
OpenAI published additional technical details about Voice Engine, its text-to-speech model capable of voice cloning from short audio samples. The post covers the underlying technology and safety research accompanying the system. Voice Engine has been in limited preview, with OpenAI citing concerns about misuse of voice cloning as a reason for controlled rollout.
Andon Labs on building frontier evals: VendingBench and evaluating Claude models
Latent Space interviews Lukas Petersson and Axel Backlund of Andon Labs, the creators of VendingBench, about their approach to building real-world AI evaluations. The conversation covers their experience evaluating Claude models across the capability spectrum from Haiku to Mythos, and their methodology for constructing durable frontier evals. The episode is notable for touching on a speculative or unreleased Claude model tier called 'Mythos.'
Advancing voice intelligence with new models in the API
OpenAI is releasing new realtime voice models via its API with capabilities spanning reasoning, translation, and transcription. The announcement targets developers building voice-enabled applications and represents an expansion of OpenAI's voice intelligence offerings beyond the existing Realtime API. The models are positioned to enable more natural and intelligent voice experiences in production deployments.



