Almanac
DPO

technique · active · dpo-0d8f2bdd · 3 events · first seen 1mo ago

Aliases: DPO

Recent events (3)

6 · Hugging Face Blog · 1mo ago

TRL v1.0: Post-Training Library Built to Move with the Field

Hugging Face has released TRL v1.0, a major milestone for its post-training library focused on reinforcement learning from human feedback and related alignment techniques. The release signals a stabilization of the API and feature set after iterative development tracking the rapidly evolving post-training landscape. TRL is widely used in the open-source community for fine-tuning and aligning language models using methods such as PPO, DPO, and GRPO.

5 · arXiv · cs.CL · 8d ago

AdvGRPO: Stable co-training framework for adaptive red teaming of language models

Researchers introduce AdvGRPO, a co-training framework that makes GRPO viable for joint attacker-defender optimization in LLM red teaming, addressing previously reported instability. The method uses dense multi-channel rewards and decoupled advantage normalization, with a curriculum progressing from single-turn to multi-turn attacks before bootstrapping co-training. Co-trained defenders outperform baselines on safety benchmarks, and the attacks show transferability across models.

5 · GitHub Trending · 4d ago

ms-swift: ModelScope framework for fine-tuning 600+ LLMs and 300+ MLLMs

ms-swift is an open-source Python framework from ModelScope supporting PEFT and full-parameter fine-tuning methods (CPT, SFT, DPO, GRPO) across 600+ LLMs and 300+ multimodal LLMs, including Qwen3, DeepSeek, Llama4, and others. The project has accumulated 14,487 GitHub stars and was accepted at AAAI 2025. It serves as a broad-coverage training harness for the current generation of open-weights frontier models.