Almanac
4 · arXiv cs.AI (Artificial Intelligence) · 47h ago

G2Rec: Scalable framework unifying graph-based user modeling with semantic tokenization for generative recommendation

Researchers propose G2Rec, a framework that combines holistic graph-based user co-engagement modeling with semantic tokenization for industrial-scale generative recommendation systems. The approach addresses limitations of existing methods—scalability issues in graph serialization and lack of supervision in semantic tokenization—by learning user interest prototypes without ground-truth labels. The system has been deployed in production across product surfaces and evaluated on public datasets, showing improvements over prior methods.

Related events (8)

5 · arXiv · cs.CL · 15d ago

OneReason: Activating Chain-of-Thought Reasoning in Generative Recommendation Models

Researchers from the OneRec team introduce OneReason, a framework for enabling reasoning capabilities in generative recommendation models deployed across short-video, live-streaming, advertising, and e-commerce. The work identifies a key failure mode: naively adding a thinking mode does not outperform non-thinking baselines. It attributes this to deficits in two areas, itemic token perception and user behavior cognition. The proposed solution combines perception-focused pre-training, a three-level cognition-enhanced CoT format for supervised fine-tuning, and a specialize-then-unify RL training recipe.

4 · arXiv · cs.CL · 10d ago

GenAIR: LLM-grounded archetype representations improve sequential recommendation

GenAIR is a framework that uses LLMs to infer 'archetype' profiles of items' ideal target audiences, generating richer item embeddings for sequential recommendation systems. A behavioral calibration objective aligns these semantic embeddings with actual user interaction patterns, closing the gap between language-space representations and real-world behavior. Experiments on three datasets show consistent improvements over state-of-the-art baselines across multiple sequential recommendation models.

6 · arXiv · cs.AI · 1mo ago

Semantic Generative Tuning (SGT) for Unified Multimodal Models

This paper introduces Semantic Generative Tuning (SGT), a post-training paradigm for unified multimodal models (UMMs) that bridges the gap between visual understanding and visual generation. The authors find that image segmentation tasks serve as optimal generative proxies, providing structural semantics that improve both perception and generative layout fidelity. SGT aligns representation spaces across understanding and generation objectives, improving feature linear separability and visual-textual attention allocation. Evaluations show consistent gains on multimodal comprehension and generative fidelity benchmarks.

6 · arXiv · cs.CL · 17d ago

Taiji: Pareto Optimal Policy Optimization for LLM-enhanced recommendation at Kuaishou scale

Researchers from Kuaishou present Taiji, an LLM-as-Enhancer framework for industrial recommender systems that addresses two bottlenecks: generating high-quality chain-of-thought data via reverse-engineered reasoning and rejection sampling during SFT, and balancing semantic vs. ID-based rewards during RL alignment via a new algorithm called Pareto Optimal Policy Optimization (POPO). The system has been deployed on Kuaishou's advertising platform since May 2026, serving over 400 million daily users. The paper contributes both a practical deployment case study and a novel RL alignment technique for the LLM4Rec paradigm.

4 · arXiv · cs.CL · 11d ago

N-GRPO: Semantic Neighbor Mixing for Improved Policy Optimization in LLM Reasoning

A new arXiv preprint introduces N-GRPO, an exploration strategy for the GRPO reinforcement learning framework that improves solution diversity during rollout by mixing embeddings of anchor tokens with their nearest semantic neighbors rather than using token-level sampling or random noise. The method is evaluated on DeepSeek-R1-Distill-Qwen models of various sizes and shows consistent improvements on math reasoning benchmarks plus out-of-distribution generalization. The work targets a known limitation in RLHF-style training: redundant rollout trajectories that reduce effective learning signal.
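The neighbor-mixing idea in this summary can be illustrated with a toy sketch. This is not the paper's implementation: the embedding table, the cosine-similarity neighbor search, the neighbor count `k`, the mixing weight `alpha`, and the function names are all assumptions made for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest_neighbors(table, anchor, k):
    """Return the k tokens most cosine-similar to `anchor`, excluding itself."""
    others = [tok for tok in table if tok != anchor]
    others.sort(key=lambda tok: cosine(table[anchor], table[tok]), reverse=True)
    return others[:k]


def mix_with_neighbors(table, anchor, k=2, alpha=0.8):
    """Blend the anchor embedding with the mean of its k nearest neighbors.

    `alpha` is the weight kept on the anchor (an assumed hyperparameter).
    """
    neighbors = nearest_neighbors(table, anchor, k)
    dim = len(table[anchor])
    mean = [sum(table[n][i] for n in neighbors) / len(neighbors) for i in range(dim)]
    return [alpha * table[anchor][i] + (1 - alpha) * mean[i] for i in range(dim)]


# Hypothetical 2-D embedding table, for illustration only.
table = {
    "add": [1.0, 0.0],
    "sum": [0.9, 0.1],
    "plus": [0.8, 0.2],
    "cat": [0.0, 1.0],
}
mixed = mix_with_neighbors(table, "add", k=2, alpha=0.8)
```

Here the anchor "add" is nudged toward "sum" and "plus" rather than toward an unrelated token or a random direction, which is the contrast with noise-based exploration that the summary draws.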

5 · arXiv · cs.LG · 26d ago

Good Token Hunting: Token Selection Framework for Visual Geometry Transformers

This paper introduces a two-stage token selection framework to address the quadratic computational scaling of global attention in visual geometry transformers used for multi-view 3D reconstruction. The approach combines diversity-based inter-frame selection (frame-level) with entropy-guided intra-frame sparsification (token-level within frames). Experiments demonstrate over 85% acceleration for 500-image scenes while maintaining or improving baseline reconstruction quality, offering a favorable speed-accuracy trade-off.

5 · Hugging Face Blog · 1mo ago

Unlocking Agentic RL Training for GPT-OSS: A Practical Retrospective

A Hugging Face blog post authored by LinkedIn describes practical lessons from implementing agentic reinforcement learning training for GPT-OSS models. The retrospective covers engineering and algorithmic challenges encountered when applying RL to agentic workflows. Because this is a tier-2 source and no body content was available, the depth and specific findings cannot be fully assessed; the topic nonetheless sits at the intersection of agentic systems and RLHF/RL training pipelines.

6 · Anthropic News · 17d ago

Anthropic introduces Contextual Retrieval to reduce RAG retrieval failures by up to 67%

Anthropic published a technical method called Contextual Retrieval that combines Contextual Embeddings and Contextual BM25 to address the context-loss problem in traditional RAG pipelines. The approach prepends chunk-specific context to each chunk before it is embedded and indexed, reducing failed retrievals by 49% on its own and by 67% when combined with reranking. The post also notes that knowledge bases under 200K tokens can skip retrieval and be placed whole in the prompt, with prompt caching reducing the cost and latency, and it provides a cookbook for deployment with Claude.
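The core move, prepending situating context to a chunk before it is indexed, can be shown with a small sketch. This is an illustration under assumptions, not Anthropic's cookbook code: the helper names, the example chunk and context, and the toy term-overlap score (a stand-in for BM25, not BM25 itself) are all hypothetical.

```python
import re


def contextualize(chunk: str, context: str) -> str:
    """Prepend a short situating context to a chunk before it is indexed."""
    return f"{context.strip()}\n\n{chunk.strip()}"


def tokens(text: str) -> set:
    """Lowercased word set; a crude stand-in for a real tokenizer."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def overlap_score(query: str, document: str) -> int:
    """Count distinct query terms found in the document (toy score, not BM25)."""
    return len(tokens(query) & tokens(document))


# Hypothetical chunk that has lost its surrounding context.
chunk = "The company's revenue grew by 3% over the previous quarter."
# Hypothetical context, written by hand here; in the described method a
# model that sees the whole document would generate it.
context = "This chunk is from ACME Corp's Q2 2023 financial report."

query = "ACME Q2 2023 revenue growth"
plain_score = overlap_score(query, chunk)
contextual_score = overlap_score(query, contextualize(chunk, context))
```

On this toy input the bare chunk matches only "revenue", while the contextualized chunk also matches the company name and reporting period, which is the kind of lost context the method aims to restore.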