Corpus-Grounded Feature Diffusion

technique · active · provisional · corpus-grounded-feature-diffusion-dc67f338 · 1 event · first seen 8d ago

Aliases: Corpus-Grounded Feature Diffusion

Recent events (1)

4 · arXiv · cs.CL · 8d ago

Corpus-Grounded Feature Diffusion pipeline for automated IEP generation in Traditional Chinese

Researchers propose Corpus-Grounded Feature Diffusion (CGFD), a low-resource fine-tuning pipeline that automates the drafting of Individualized Education Programs (IEPs) from Traditional Chinese parent-teacher interview transcripts. The approach fine-tunes Breeze-7B with QLoRA on 582 synthetically diffused samples and applies schema-constrained decoding at inference time; the authors report that Grammar-Constrained Decoding is counterproductive under Traditional Chinese token budgets. On a small formal hold-out set (n=10), the system achieves a BERTScore F1 of 0.779, outperforming zero-shot GPT-5.4, DeepSeek-V3.2, Gemini-3-Flash-Preview, and Llama-4-Maverick baselines while supporting fully local, air-gapped inference. The work addresses a gap in Traditional Chinese special-education NLP and demonstrates a privacy-preserving deployment pattern for generating sensitive documents.
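For readers unfamiliar with the metric quoted above, BERTScore F1 combines a precision and a recall obtained by greedily matching token embeddings by cosine similarity. The following is a minimal sketch of that arithmetic only, not the paper's evaluation code: it assumes toy 2-dimensional vectors in place of real contextual embeddings, omits idf weighting and baseline rescaling, and the names `cosine`, `bertscore_f1`, `candidate`, and `reference` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def bertscore_f1(candidate_vecs, reference_vecs):
    """Greedy-matching F1 over token embeddings (no idf weighting)."""
    # Precision: each candidate token is matched to its most similar reference token.
    precision = sum(
        max(cosine(c, r) for r in reference_vecs) for c in candidate_vecs
    ) / len(candidate_vecs)
    # Recall: each reference token is matched to its most similar candidate token.
    recall = sum(
        max(cosine(r, c) for c in candidate_vecs) for r in reference_vecs
    ) / len(reference_vecs)
    if precision + recall <= 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy embeddings (assumption): one vector per token.
candidate = [[1.0, 0.0], [0.0, 1.0]]
reference = [[1.0, 0.0], [1.0, 0.0]]

score = bertscore_f1(candidate, reference)
print(round(score, 3))
```

In this toy case precision is 0.5 (one of the two candidate tokens has no similar reference token) and recall is 1.0, so the F1 is their harmonic mean. A real evaluation would take the vectors from a pretrained encoder rather than hand-written lists.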