Entity · technique

Semantic-Acoustic Primitives

technique · active · semantic-acoustic-primitives-c95fe39c · 1 event · first seen Jun 1, 2026

Aliases: Semantic-Acoustic Primitives

Co-occurring entities

Audio-LLM · UniAudio-Token · Tencent · Semantic-Acoustic Equilibrium

More like this (12)

Semantic-Acoustic Equilibrium
Audio Interaction Model
Primitive Modulated Structure Representation (PMSR)
Fast Adaptive Semantic Entropy
Hierarchical Acoustic-Semantic Modeling: Modality Separation and Semantic Coherence for Full-Duplex SLMs
Auditing Protocol-Level Shortcuts in Large Audio Language Model Judges for Speech Evaluation
Acoustic Cue Alignment in Audio Language Models for Speech Emotion Recognition
Unified Audio Intelligence Without Regressing on Text Intelligence
Automatic Speech Recognition
SAM Audio
q0: Primitives for Hyper-Epoch Pretraining
The Anatomy of the CTC Oracle Gap: Acoustic Exhaustion and Linguistic Recovery

Recent events (1)

6 · arXiv · cs.CL · Jun 1, 2026

UniAudio-Token: Semantic Speech Tokenizer with General Audio Perception for Audio-LLMs

UniAudio-Token is a framework from Tencent that extends semantic speech tokenizers, which are commonly used as interfaces for Audio-LLMs, to support general audio perception without sacrificing speech quality. It introduces two mechanisms: Semantic-Acoustic Primitives (SAP), a structured supervision scheme that decomposes audio into linguistic, vocal, and auditory-scene components, and Semantic-Acoustic Equilibrium (SAE), a content-aware gating mechanism that restores fine-grained acoustic details from shallow layers. In the reported evaluations, it outperforms all single-codebook baseline tokenizers on both understanding and generation tasks when integrated with downstream LLMs. Code, training and inference scripts, and model checkpoints are publicly released.
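The summary does not say how the SAE gate is computed, so the following is only a generic illustration of what a content-aware gate over shallow and deep features could look like. It is not UniAudio-Token's implementation. The function name `gated_fusion`, the sigmoid gate, the concatenated input, and the residual form `deep + gate * shallow` are all assumptions made for this sketch.

```python
import math
from typing import List

Vector = List[float]


def sigmoid(x: float) -> float:
    """Logistic function, maps a real-valued logit into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(
    deep: Vector,
    shallow: Vector,
    weights: List[Vector],
    bias: Vector,
) -> Vector:
    """Hypothetical content-aware gate (an assumption, not the paper's formula).

    For each feature dimension i:
        gate_i = sigmoid(weights[i] . concat(deep, shallow) + bias[i])
        out_i  = deep_i + gate_i * shallow_i

    The gate depends on the content of both inputs, so the amount of
    shallow (acoustic) detail added back varies per frame and per dimension.
    """
    if len(deep) != len(shallow):
        raise ValueError("deep and shallow features must have the same size")
    if len(weights) != len(deep) or len(bias) != len(deep):
        raise ValueError("need one weight row and one bias per feature")

    features = deep + shallow  # list concatenation: [deep; shallow]
    fused = []
    for i, (d, s) in enumerate(zip(deep, shallow)):
        logit = sum(w * f for w, f in zip(weights[i], features)) + bias[i]
        gate = sigmoid(logit)
        fused.append(d + gate * s)
    return fused


if __name__ == "__main__":
    deep = [1.0, 2.0]       # stand-in for a deep, semantic-heavy feature frame
    shallow = [0.5, -1.0]   # stand-in for a shallow, acoustic-heavy feature frame
    zero_w = [[0.0] * 4 for _ in range(2)]
    print(gated_fusion(deep, shallow, zero_w, [0.0, 0.0]))
```

With all-zero weights and biases the gate sits at 0.5 everywhere, so half of each shallow feature is added back. A strongly negative bias closes the gate and the output falls back to the deep features alone. In a trained model the weights would be learned, which is what would make the gate depend on the audio content.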

Agent and Tool Ecosystem · Multimodal Progress · Audio-LLM · UniAudio-Token · Tencent