Carnegie Mellon University

organization · active · carnegie-mellon-university-0f0c2092 · 6 events · first seen 28d ago

Aliases: Carnegie Mellon University

Recent events (6)

4 · Anthropic News · 14d ago

Anthropic pledges $2M to Carnegie Mellon for AI energy and cybersecurity programs

Anthropic announced a $2 million contribution to Carnegie Mellon University, split equally between the Scott Institute for Energy Innovation (AI-powered grid management research) and the picoCTF cybersecurity education program. The announcement was made by CEO Dario Amodei at the Pennsylvania Energy and Innovation Summit alongside President Trump and other government and industry leaders. The move signals Anthropic's positioning on U.S. AI infrastructure policy, framing energy availability as central to maintaining American leadership in frontier AI development.

6 · The Batch · 24d ago

Agent Benchmarks Skew Toward Software Engineering, Missing Most Economically Valuable Labor

Researchers from Carnegie Mellon University and Stanford University mapped over 10,000 examples from 43 agent benchmarks to U.S. labor statistics using O*NET occupational taxonomies, finding that current benchmarks heavily over-represent software engineering relative to its share of employment and wages. Office and administrative support (18.2M workers, $869.8B in wages) and management (11M workers, $1,326.3B in wages) are vastly under-represented compared to computer and mathematical occupations (5.2M workers, $563.6B in wages). No single benchmark covered more than 50% of work activities, and all 43 benchmarks combined covered only 56.5%. The study identifies a systematic gap between where agentic AI is being evaluated and where the largest economic opportunity lies.

8 · Anthropic News · 14d ago

Anthropic Frontier Red Team reports early-warning signs of rapid AI progress in cybersecurity and biosecurity capabilities

Anthropic's Frontier Red Team published findings from a year of safety evaluations across four model releases, documenting rapid capability gains in dual-use domains. In cybersecurity, Claude 3.7 Sonnet now solves roughly a third of Cybench CTF challenges (up from ~5% a year ago), and with the Incalmo toolset was able to replicate a large-scale network attack in realistic cyber range environments. In biosecurity, Claude has moved from underperforming virology experts to exceeding them on the VCT benchmark within one year, and exceeds human expert baselines on cloning workflows. Anthropic assesses current models as showing 'early warning' signs but not yet crossing thresholds of substantially elevated national security risk.

7 · The Batch · 11d ago

Fine-tuning LLMs on summary-expansion tasks strips copyright alignment guardrails, enabling up to 92% verbatim book reproduction

Researchers from Stony Brook University, Carnegie Mellon University, and Columbia Law School fine-tuned DeepSeek-V3.1, Gemini 2.5 Pro, and GPT-4o on a task of expanding plot summaries into prose paragraphs, finding that this caused the models to reproduce verbatim up to 91.9% of the text of books in their pretraining data. The key finding is that alignment training suppresses but does not erase memorized text from model weights, and fine-tuning on verbatim-generation tasks can re-enable that recall, bypassing system-prompt-level copyright guardrails. The result has direct implications for model providers offering fine-tuning APIs and for organizations deploying customized models, since copyright guardrails cannot be assumed to survive downstream fine-tuning.

6 · The Batch · 14d ago

Data Points: NemoClaw enterprise stack, GPT-5.4 mini/nano, Nemotron 3 Nano 4B, Midjourney V8, and Mamba-3

A multi-item roundup covers several AI developments. Nvidia unveiled NemoClaw at GTC 2026, an enterprise software stack that integrates with OpenClaw to add security and governance for agentic deployments, with launch partners including Salesforce, Cisco, and CrowdStrike. OpenAI released GPT-5.4 mini and nano, smaller variants optimized for speed, with benchmark results on SWE-Bench Pro and OSWorld-Verified, priced at $0.75 and $0.20 per million input tokens respectively. Nvidia also released Nemotron 3 Nano 4B, a 4B-parameter hybrid Mamba-Transformer model for on-device use. Additional items cover Midjourney V8 alpha (5x faster, diffusion-only) and Mamba-3, a 1.5B state space model from CMU and Together.AI with improved accuracy over Mamba-2.

3 · OpenAI Blog · 28d ago

OpenAI Co-Organizes Procgen and MineRL NeurIPS 2020 Competitions

OpenAI announced that it was co-organizing two NeurIPS 2020 competitions with AIcrowd, Carnegie Mellon University, and DeepMind, centered on the Procgen Benchmark and MineRL environments. The competitions aim to advance research in procedurally generated environments and sequential decision-making in Minecraft-like settings. The announcement dates from June 2020.