Steerable Model Merging
technique · active · provisional
steerable-model-merging-9954bb91 · 1 event · first seen 2d ago
Aliases: Steerable Model Merging
More like this (12)
Model Merging / Weight Interpolation
Activation Steering
Agentic Chain-of-Thought Steering
representation-level steering
State-Conditioned Dynamic Steering
SafeSteer
steering vectors
Predicting Future Behaviors in Reasoning Models Enables Better Steering
Visual Verification Enables Inference-time Steering and Autonomous Policy Improvement
Agentic Chain-of-Thought Steering for Efficient and Controllable LLM Reasoning
Gumbel noise steering
Model Collapse
Recent events (1)
Steerable Model Merging (ST-Merge) improves multilingual reasoning via adaptive gated cross-attention
Researchers propose ST-Merge, a framework for adaptively merging a multilingual model and a reasoning model using a gated cross-attention mechanism that weights each source model's contribution based on input characteristics. The approach addresses the limitation of static one-size-fits-all merging strategies that fail to resolve conflicts between source models. Experiments across 21 languages on four multilingual reasoning benchmarks show consistent improvements over strong baselines.
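The summary above does not give ST-Merge's actual architecture, so the following is only a minimal sketch of the general idea of input-dependent gating between two source models, not the authors' method. It assumes each source model exposes a key vector and an output vector, and that the gate is a scaled dot-product attention over those two keys; the function name `gated_merge` and all vectors are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def gated_merge(query, source_keys, source_outputs):
    """Blend per-source outputs using attention weights computed from the input.

    query          -- vector summarizing the current input (hypothetical)
    source_keys    -- one key vector per source model (hypothetical)
    source_outputs -- one output vector per source model
    Returns (merged_output, weights).
    """
    scale = math.sqrt(len(query))
    # One attention score per source model: how well its key matches the input.
    scores = [dot(query, key) / scale for key in source_keys]
    weights = softmax(scores)
    dim = len(source_outputs[0])
    # Weighted sum of the source outputs, computed per dimension.
    merged = [
        sum(w * out[i] for w, out in zip(weights, source_outputs))
        for i in range(dim)
    ]
    return merged, weights


# Illustrative values only: the query aligns with the first (multilingual) key,
# so the gate should favor the first source's output.
multilingual_key = [1.0, 0.0]
reasoning_key = [0.0, 1.0]
query = [2.0, 0.0]
merged, weights = gated_merge(
    query,
    [multilingual_key, reasoning_key],
    [[1.0, 1.0], [3.0, 3.0]],
)
```

Unlike a static merge, which would use one fixed pair of weights for every input, the weights here change with `query`, which is the property the summary attributes to adaptive merging.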