Entity · other

Audio-LLM

other · active · audio-llm-83019c73 · 1 event · first seen Jun 1, 2026

Aliases: Audio-LLM

Co-occurring entities

UniAudio-Token · Tencent · Semantic-Acoustic Equilibrium · Semantic-Acoustic Primitives

More like this (12)

SpeechLLM · StreamingLLM · Dep-LLM · EvalLLM · LLM Wiki · Arabic LLMs · SVD-LLM · LLM-as-a-Judge · vLLM · RTLLM · LLawCo · LLM

Recent events (1)

arXiv · cs.CL · Jun 1, 2026

UniAudio-Token: Semantic Speech Tokenizer with General Audio Perception for Audio-LLMs

UniAudio-Token is a framework from Tencent that extends semantic speech tokenizers, which are commonly used as interfaces for Audio-LLMs, to support general audio perception without sacrificing speech quality. It introduces two mechanisms. Semantic-Acoustic Primitives (SAP) provide structured supervision by decomposing audio into linguistic, vocal, and auditory-scene components. Semantic-Acoustic Equilibrium (SAE) is a content-aware gating mechanism that restores fine-grained acoustic details from shallow layers. In the reported evaluations, it outperforms all single-codebook baseline tokenizers on both understanding and generation tasks when integrated with downstream LLMs. Code, training and inference scripts, and model checkpoints are publicly released.
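The summary describes SAE only as a content-aware gate that brings shallow-layer acoustic detail back into the representation. As a rough illustration of what such a gate could look like, here is a minimal sketch for a single frame. The function name `gated_fuse`, the scalar-gate form, and the residual-style blend are all assumptions made for this example; they are not taken from the UniAudio-Token paper or its released code.

```python
import math


def sigmoid(x: float) -> float:
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_fuse(deep, shallow, gate_weights, gate_bias=0.0):
    """Blend one frame's deep (semantic) and shallow (acoustic) features.

    Hypothetical form, assumed for illustration only:
        gate  = sigmoid(w . [deep; shallow] + b)   # scalar in (0, 1)
        fused = deep + gate * shallow              # elementwise
    The gate depends on the frame's content, so different frames can
    admit different amounts of shallow-layer detail.
    """
    if len(deep) != len(shallow):
        raise ValueError("deep and shallow must have the same length")
    if len(gate_weights) != 2 * len(deep):
        raise ValueError("gate_weights must cover [deep; shallow]")

    # Concatenate both feature vectors to form the gate's input.
    context = list(deep) + list(shallow)
    score = sum(w * x for w, x in zip(gate_weights, context)) + gate_bias
    gate = sigmoid(score)

    # Add back a gated share of the shallow features.
    fused = [d + gate * s for d, s in zip(deep, shallow)]
    return fused, gate


if __name__ == "__main__":
    deep = [1.0, 2.0]
    shallow = [0.5, -0.5]
    # All-zero weights give a neutral gate of 0.5.
    fused, gate = gated_fuse(deep, shallow, [0.0, 0.0, 0.0, 0.0])
    print(gate, fused)
```

With a strongly negative `gate_bias` the gate approaches 0 and the output stays close to the deep features, which is the behaviour one would want for frames where added acoustic detail is not useful. A real model would learn the gate parameters and would likely use a per-channel gate rather than a scalar.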

Agent and Tool Ecosystem · Multimodal Progress · Audio-LLM · UniAudio-Token · Tencent