Almanac
5 · MIT Technology Review - AI · 16d ago

Courts grapple with surge of AI-generated legal filings from pro se litigants

MIT Technology Review reports on how federal courts are managing an influx of AI-generated documents submitted by pro se litigants, who represent themselves without a lawyer. The piece focuses on the practical challenges judges face in evaluating filings that may contain hallucinated content or procedural errors. The trend is an emerging deployment pattern with implications for both the legal system and AI accountability.

Related events (8)

4 · AI Snake Oil · 1mo ago

AI Won't Automatically Make Legal Services Cheaper

This commentary applies an 'AI as Normal Technology' framework to analyze whether AI will reduce the cost of legal services. The piece argues against the assumption that AI-driven efficiency gains will automatically translate into lower prices for consumers in the legal sector. It examines structural and market factors that may prevent cost savings from being passed on, situating legal AI within a broader critique of AI hype.

6 · arXiv · cs.CL · 3d ago

LegalHalluLens: Typed hallucination auditing and calibrated multi-agent debate for legal AI

Researchers introduce LegalHalluLens, an auditing framework for hallucination in legal AI systems, evaluated across 510 contracts and 249,252 clause-level instances from the CUAD dataset. The framework introduces typed hallucination profiles across four claim categories (numeric, temporal, obligation/entitlement, factual) and a Risk Direction Index (RDI) that distinguishes omission from invention errors. A calibrated multi-agent debate pipeline reduces fabricated detections by 45% using a 4B-parameter model competitive with commercial APIs. The work reveals that aggregate hallucination rates (~52%) mask a 38-40 percentage-point gap between claim types and that two systems with identical aggregate rates can have opposite risk profiles.
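The point that identical aggregate rates can hide opposite risk profiles can be illustrated with a toy calculation. This is a minimal sketch on made-up data, not the paper's method: the `profile` function, the outcome labels, and the signed `direction` score are hypothetical stand-ins, and the summary above does not give the actual definition of the Risk Direction Index.

```python
from collections import Counter

# Claim categories named in the summary; "obligation" abbreviates obligation/entitlement.
CLAIM_TYPES = ("numeric", "temporal", "obligation", "factual")
ERROR_OUTCOMES = ("omission", "invention")


def profile(records):
    """Summarize (claim_type, outcome) pairs; outcome is "ok", "omission", or "invention"."""
    totals = Counter()   # claims seen per type
    errors = Counter()   # erroneous claims per type
    by_kind = Counter()  # erroneous claims per error kind
    for claim_type, outcome in records:
        totals[claim_type] += 1
        if outcome in ERROR_OUTCOMES:
            errors[claim_type] += 1
            by_kind[outcome] += 1
    n = sum(totals.values())
    n_err = sum(by_kind.values())
    per_type = {t: errors[t] / totals[t] for t in CLAIM_TYPES if totals[t]}
    # Hypothetical direction score in [-1, 1] (an assumption, not the paper's RDI):
    # +1 means every error is an invention, -1 means every error is an omission.
    direction = (by_kind["invention"] - by_kind["omission"]) / n_err if n_err else 0.0
    return {
        "aggregate": n_err / n if n else 0.0,
        "per_type": per_type,
        "direction": direction,
    }


def make(claim_type, outcome, count):
    """Build `count` identical toy records."""
    return [(claim_type, outcome)] * count


# Two made-up systems with the same aggregate error rate.
system_a = (make("numeric", "invention", 8) + make("numeric", "ok", 2)
            + make("factual", "invention", 2) + make("factual", "ok", 8))
system_b = (make("numeric", "omission", 2) + make("numeric", "ok", 8)
            + make("factual", "omission", 8) + make("factual", "ok", 2))

for name, system in (("A", system_a), ("B", system_b)):
    print(name, profile(system))
```

In this toy data both systems are wrong on half of their claims, yet system A only invents content and fails mostly on numeric claims, while system B only omits content and fails mostly on factual claims.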

4 · One Useful Thing · 1mo ago

Giving your AI a Job Interview

This commentary piece argues that as AI-generated advice becomes more consequential, users need systematic methods to evaluate AI reliability and quality—analogous to a job interview process. The author proposes frameworks for assessing AI outputs before trusting them for important decisions. The piece addresses the practical challenge of calibrating trust in AI systems across different use cases.

5 · arXiv · cs.CL · 3d ago

Benchmark gap paper: EU AI Act requires doctrinal legal reasoning evals that don't yet exist

A new arXiv preprint identifies a critical measurement gap in legal AI evaluation: existing benchmarks test paralegal and ancillary tasks rather than doctrinal legal reasoning, which is the interpretive core of legal work. The authors argue this gap is not merely methodological but legally significant, because the EU AI Act's 'appropriate accuracy' requirement for high-risk AI in the judicial domain cannot be operationalized without a doctrinal-reasoning benchmark. The paper proposes a benchmark framework aimed at filling this gap under EU AI Act compliance requirements.

3 · Simon Willison's Weblog · 1mo ago

Your AI Use Is Breaking My Brain

Simon Willison comments on the phenomenon of AI-generated or AI-assisted content degrading the quality of online discourse and information environments. The piece reflects on how widespread AI use is affecting the experience of consuming internet content. This is a commentary piece from a prominent developer/blogger on the social and epistemic effects of AI proliferation.

5 · Import AI · 1mo ago

Import AI 455: AI systems are about to start building themselves

Import AI issue 455 covers the emerging trend of AI systems automating AI research, framing it as a first step toward recursive self-improvement. The commentary synthesizes recent developments suggesting AI is beginning to participate meaningfully in its own development pipeline. As a tier-2 newsletter, this represents curated analysis of frontier AI research directions rather than primary reporting.

4 · Import AI · 1mo ago

ImportAI 449: LLMs training other LLMs; 72B distributed training run; computer vision is harder than generative text

Import AI issue 449 covers several AI/ML developments including LLMs being used to train other LLMs, a 72B parameter distributed training run, and analysis of why computer vision remains harder than generative text. The newsletter also touches on potential political implications of AI progress. As a tier-2 commentary source, this aggregates and contextualizes multiple technical developments across the AI landscape.

5 · Hugging Face Blog · 1mo ago

Constitutional AI with Open LLMs

This Hugging Face blog post explores implementing Constitutional AI (CAI) techniques using open-weight language models. The post likely covers how to replicate Anthropic's CAI alignment methodology—using a set of principles to guide model self-critique and revision—without relying on proprietary systems. It represents a practical contribution to democratizing alignment research tooling.