University of California San Diego
university-of-california-san-diego-23b25753 · 3 events · first seen 18d ago
Aliases: University of California San Diego, UC San Diego
Recent events (3)
Study finds state media in training data causes LLMs to reflect government propaganda in native languages
Researchers from University of Oregon, Purdue, UCSD, NYU, and Princeton found that state-controlled media is heavily overrepresented in web-scraped training datasets, causing Claude 3 Sonnet and GPT-4o to express significantly more favorable attitudes toward authoritarian governments when prompted in those governments' native languages. Chinese state media accounts for over 40x more documents in CulturaX than Chinese Wikipedia, and both models reproduced state-media strings at rates of 3-5%. When prompted in Chinese, both models gave responses more favorable to China's government than their responses to English prompts on the same topics roughly 68-75% of the time, and the effect scaled with a country's World Press Freedom Index ranking.
Test-Time Training End-to-End (TTT-E2E) Retrains Model Weights to Handle Long Inputs
Researchers from Astera Institute, Nvidia, Stanford, UC Berkeley, and UC San Diego introduced TTT-E2E, a method that compresses long context into transformer weights by training the model during inference via meta-learning. The approach uses sliding-window attention restricted to 8,000 tokens and updates only the fully connected layers of the last quarter of the network on each 1,000-token chunk at inference time, keeping per-token generation latency roughly constant as context scales to 128,000 tokens. TTT-E2E slightly outperforms vanilla transformers on next-token prediction loss across long contexts and matches efficient architectures like Mamba 2 and Gated DeltaNet on inference speed, but fails dramatically on Needle-in-a-Haystack retrieval beyond 8,000 tokens and incurs substantially higher training latency. The work reframes long-context handling as a training-inference trade-off rather than an architectural design problem.
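The figures in the summary above can be turned into a quick back-of-the-envelope schedule. The following is a minimal sketch, not the authors' implementation: the function name `ttt_schedule`, the 32-layer depth, and the choice to count only complete chunks are all assumptions made for illustration. Only the 8,000-token window, the 1,000-token chunk size, and the "last quarter of the network" rule come from the summary.

```python
def ttt_schedule(context_len, chunk=1_000, window=8_000, n_layers=32):
    """Sketch of the update schedule described in the summary.

    Returns a tuple of:
      - the number of inference-time weight updates (one per complete chunk;
        how a trailing partial chunk is handled is an assumption here),
      - the indices of the layers whose fully connected blocks are updated
        (the last quarter of a hypothetical n_layers-deep network),
      - the maximum attention span (capped by the sliding window).
    """
    n_updates = context_len // chunk
    first_updated = n_layers - n_layers // 4
    updated_layers = list(range(first_updated, n_layers))
    attn_span = min(context_len, window)
    return n_updates, updated_layers, attn_span


if __name__ == "__main__":
    updates, layers, span = ttt_schedule(128_000)
    print(f"weight updates: {updates}")
    print(f"updated layers: {layers[0]}..{layers[-1]}")
    print(f"attention span: {span} tokens")
```

Under these assumptions, a 128,000-token input triggers 128 weight updates while attention never looks back more than 8,000 tokens. That is consistent with the summary's point that per-token latency stays roughly constant, and with the retrieval failures beyond the 8,000-token window.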
Meta Research Improves Image Generation via Staged Planning and Self-Revision Fine-Tuning
Researchers from Meta and collaborating universities propose a fine-tuning method that teaches image generators to compose images through discrete plan-sketch-inspect-refine cycles rather than generating all at once. Starting from BAGEL-7B, they construct ~62,000 training examples using GPT-4o and FLUX.1 Kontext to supervise each stage, achieving 83% on GenEval versus 77% for the base model and a competing method (PARM) that required 11x more training data and ~8x more inference steps. The approach improves spatial relationship accuracy, object attribute fidelity, and real-world knowledge grounding in generated images.