Entity · person

Xinyue Liu

person · active · xinyue-liu-5831e070 · 1 event · first seen Jun 5, 2026

Aliases: Xinyue Liu

Co-occurring entities

Carnegie Mellon University · DeepSeek V4 · Stony Brook University · Columbia Law School · Google · GPT-4o · Gemini-2.5-Pro · OpenAI

More like this (12)

Liu Yuxiao Qu Yuxi Baiyu Chen Kevin Xu Jason Liu Shiyi Cao Fei-Fei Li Xu Shuwen Xin Ye Jiyuan Tan Xu

Recent events (1)

7 · The Batch · Jun 5, 2026

Fine-tuning LLMs on summary-expansion tasks strips copyright alignment guardrails, enabling up to 92% verbatim book reproduction

Researchers from Stony Brook University, Carnegie Mellon University, and Columbia Law School fine-tuned DeepSeek-V3.1, Gemini 2.5 Pro, and GPT-4o to expand plot summaries into prose paragraphs, and found that the fine-tuned models reproduced verbatim up to 91.9% of the text of books in their pretraining data. The key finding is that alignment training suppresses memorized text but does not erase it from the model weights: fine-tuning on verbatim-generation tasks can re-enable that recall, bypassing system-prompt-level copyright guardrails. The result has direct implications for model providers that offer fine-tuning APIs and for organizations that deploy customized models, because such guardrails cannot be assumed to survive downstream fine-tuning.
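The summary does not say how the 91.9% figure was computed. As a rough illustration only, one plausible way to quantify verbatim reproduction is to measure the fraction of a source text's words that reappear in a model output inside long, contiguous matching spans. The function name `verbatim_coverage` and the `min_span` threshold below are hypothetical choices for this sketch, not the researchers' metric.

```python
from difflib import SequenceMatcher


def verbatim_coverage(source: str, generated: str, min_span: int = 5) -> float:
    """Fraction of source words reproduced in `generated` within
    contiguous matching spans of at least `min_span` words.

    Illustrative assumption: this is NOT the metric used in the study.
    """
    src = source.split()
    gen = generated.split()
    if not src:
        return 0.0
    # autojunk=False keeps frequent words from being treated as junk
    # when the sequences are long.
    matcher = SequenceMatcher(None, src, gen, autojunk=False)
    # get_matching_blocks() returns non-overlapping, in-order matches;
    # short matches are ignored so common phrases do not count.
    covered = sum(
        block.size
        for block in matcher.get_matching_blocks()
        if block.size >= min_span
    )
    return covered / len(src)


if __name__ == "__main__":
    book = "the quick brown fox jumps over the lazy dog"
    output = "he said the quick brown fox jumps over a fence"
    print(f"{verbatim_coverage(book, output):.3f}")
```

A higher `min_span` makes the measure stricter, since only longer copied runs count. Because `SequenceMatcher` aligns matches in order, this sketch would undercount text that is copied out of sequence.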

AI Safety Research · Regulatory Developments · Carnegie Mellon University · Xinyue Liu · DeepSeek V4