Almanac
Steerable Model Merging

technique · active · provisional · steerable-model-merging-9954bb91 · 1 event · first seen 2d ago

Aliases: Steerable Model Merging

Recent events (1)

4 · arXiv · cs.CL · 2d ago

Steerable Model Merging (ST-Merge) improves multilingual reasoning via adaptive gated cross-attention

Researchers propose ST-Merge, a framework that adaptively merges a multilingual model and a reasoning model through a gated cross-attention mechanism, weighting each source model's contribution according to characteristics of the input. The approach targets a limitation of static, one-size-fits-all merging strategies, which fail to resolve conflicts between the source models. Experiments across 21 languages on four multilingual reasoning benchmarks show consistent improvements over strong baselines.
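
To make the idea of input-dependent gating concrete, here is a minimal sketch in plain Python. It is not the ST-Merge architecture, which this summary does not specify. It assumes one common form of gated cross-attention: each hidden state from the multilingual model attends over the reasoning model's hidden states, and a sigmoid gate computed from that same hidden state decides how much of the attended context to blend in. The names `gated_merge`, `cross_attend`, `gate_w`, and `gate_b` are hypothetical, and the gate parameters are hand-set here where a real system would learn them.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def cross_attend(query, memory):
    """Single-head scaled dot-product attention of one query over `memory`.

    Assumption: the memory vectors serve as both keys and values
    (no learned projections), to keep the sketch short.
    """
    scale = math.sqrt(len(query))
    weights = softmax([dot(query, m) / scale for m in memory])
    return [
        sum(w * m[i] for w, m in zip(weights, memory))
        for i in range(len(query))
    ]


def gated_merge(h_multilingual, h_reasoning, gate_w, gate_b):
    """Blend two models' hidden states with a per-position, input-dependent gate.

    For each multilingual hidden state h:
      context = attention of h over the reasoning model's states
      g       = sigmoid(gate_w . h + gate_b)   (hypothetical gate)
      merged  = (1 - g) * h + g * context
    Returns the merged states and the gate values.
    """
    merged, gates = [], []
    for h in h_multilingual:
        context = cross_attend(h, h_reasoning)
        g = sigmoid(dot(gate_w, h) + gate_b)
        gates.append(g)
        merged.append([(1 - g) * a + g * c for a, c in zip(h, context)])
    return merged, gates


# Toy usage: two positions, two-dimensional hidden states.
h_multi = [[1.0, 0.0], [0.0, 1.0]]
h_reason = [[0.5, 0.5], [1.0, -1.0]]
merged, gates = gated_merge(h_multi, h_reason, gate_w=[2.0, -2.0], gate_b=0.0)
```

In this toy setup the gate differs across the two positions because it is computed from each input vector, which is the contrast with a static merge that would apply one fixed mixing weight everywhere.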