Entity · model

Perception Encoder

model · active · perception-encoder-af4005ca · 1 event · first seen May 18, 2026

Aliases: Perception Encoder

Co-occurring entities

SAM Audio Judge · Segment Anything Model 2 · SAM Audio · flow-matching diffusion transformer · Segment Anything Playground · SAM Audio-Bench · Meta · Perception Encoder Audiovisual

More like this (12)

Perception Encoder Audiovisual · Perceiver IO · VAE Encoder · Positional Encoding · Percepta · Falcon Perception · TF-Engram · Imaginative Perception Tokens · Perceptual Strength of Embodiment (PSE) · video encoding · HumanEval · Voice Engine

Recent events (1)

Meta AI Blog · May 18, 2026

Meta Introduces SAM Audio: Unified Multimodal Model for Audio Separation with PE-AV, Benchmark, and Judge Model

Meta has released SAM Audio, a unified multimodal audio separation model that accepts text, visual, and temporal span prompts to isolate sounds from complex audio mixtures. The system is powered by Perception Encoder Audiovisual (PE-AV), an extension of Meta's open-source Perception Encoder released earlier in 2025, and uses a flow-matching diffusion transformer architecture. Alongside the model, Meta is releasing SAM Audio-Bench (the first in-the-wild audio separation benchmark) and SAM Audio Judge (an automatic evaluation model for audio separation). All components are available today via the Segment Anything Playground.

Evaluation and Benchmarking · Agent and Tool Ecosystem · SAM Audio Judge · Segment Anything Model 2 · SAM Audio