nc-ffn-735a84e3·1 events·first seen Aliases: NC-FFN
Researchers propose replacing the standard transformer feed-forward sublayer with explicit fuzzy set operations (intersection and set difference), yielding a negation-capable FFN (NC-FFN) whose hidden units carry an interpretable logical form. At 125M scale on OpenWebText, NC-FFN matches the perplexity of a GELU baseline while remaining legible by construction. Adding soft sequence quantifiers with learned forgetting rates remedies deficits in grammatical licensing and produces units that detectably fire on grammatical licensors (comparatives, passive participles, negative-polarity items) without dictionary learning. The work advances mechanistic interpretability by providing a parameter-neutral architecture whose computations can be read as grammatical mechanisms.