4 · Hugging Face Blog · 1mo ago

NPC-Playground: A 3D Environment for LLM-Powered Non-Player Characters

Hugging Face, Gigax, and Cubzh have introduced NPC-Playground, a 3D environment where users can interact with non-player characters (NPCs) powered by large language models. The project demonstrates real-time LLM inference driving NPC behavior and dialogue. It is a practical application of LLMs to interactive entertainment and agent-like character simulation.

Enterprise Deployment Patterns · Agent and Tool Ecosystem · Gigax · NPC-Playground · Cubzh · Hugging Face

Related guides (3)

Hugging Face

Hugging Face: The Home of Open-Source AI

Enterprise Deployment Patterns · Topic guide

Enterprise Deployment Patterns: From AI Demo to Production Reality

Agent and Tool Ecosystem · Topic guide

Agent and Tool Ecosystem: How AI Is Learning to Act, Not Just Answer

Related events (8)

4 · arXiv · cs.LG · 5d ago

Persona-Pruner: framework for sculpting lightweight persona-specific LLM sub-networks via structured pruning

Persona-Pruner is a pruning framework that isolates persona-specific sub-networks from a generalist language model given only a character description, producing lightweight role-playing models without the full model's computational cost. The authors observe that naive pruning degrades role-playing fidelity because it fails to distinguish redundant knowledge from character-essential parameters. On RoleBench, Persona-Pruner reduces the performance drop by up to 93.8% relative to the strongest baseline pruning method while preserving general LLM capabilities. The work targets practical deployment scenarios such as game ecosystems with many simultaneous NPC agents.
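To make the "93.8% relative" figure concrete, here is a small worked calculation. The two drop values are hypothetical numbers chosen only to reproduce the headline percentage; they are not results from the paper, and the helper name `relative_drop_reduction` is an assumption for illustration.

```python
def relative_drop_reduction(baseline_drop: float, method_drop: float) -> float:
    """Fraction by which a method shrinks the baseline's performance drop."""
    if baseline_drop <= 0:
        raise ValueError("baseline_drop must be positive")
    return (baseline_drop - method_drop) / baseline_drop


# Hypothetical values, NOT taken from the paper:
# the baseline pruning method loses 10.0 benchmark points after pruning,
# while the compared method loses only 0.62 points.
baseline_drop = 10.0
method_drop = 0.62

reduction = relative_drop_reduction(baseline_drop, method_drop)
print(f"{reduction:.1%}")  # prints "93.8%"
```

Read this way, the claim is about how much of the pruning-induced loss is avoided, not about the absolute benchmark score.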

Inference Economics · Agent and Tool Ecosystem · RoleBench · Persona-Pruner

6 · arXiv · cs.CL · 12d ago

Agentopia: Long-term multi-agent life simulation framework for training LLMs on social behavior

Researchers introduce Agentopia, a framework for simulating 10 years of social life across 100 LLM-powered agents, enabling study of emergent social behaviors and long-term personal growth dynamics. The system defines a 'life reward' metric mirroring human well-being and uses it to train LLMs via rejection sampling. Training on simulated social experience yields a +15.6% improvement on downstream role-playing benchmarks, suggesting that synthetic social simulation can generalize to real capability gains.
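The general idea of rejection sampling against a scalar reward can be sketched as follows. This is a minimal illustration, not the Agentopia implementation: the toy generator, the reward table, the threshold, and all names (`rejection_sample`, `toy_generate`, `toy_life_reward`) are assumptions made up for the example.

```python
import random
from typing import Callable, List, Tuple


def rejection_sample(
    generate: Callable[[random.Random], str],
    reward: Callable[[str], float],
    n_candidates: int,
    threshold: float,
    seed: int = 0,
) -> List[Tuple[str, float]]:
    """Draw n_candidates samples and keep those whose reward meets the threshold."""
    rng = random.Random(seed)
    kept = []
    for _ in range(n_candidates):
        sample = generate(rng)
        score = reward(sample)
        if score >= threshold:  # reject everything below the threshold
            kept.append((sample, score))
    return kept


# Toy stand-ins (hypothetical): a real system would sample agent
# trajectories from an LLM and score them with a learned or
# rule-based well-being metric.
TOY_REWARD = {
    "help a friend": 1.0,
    "learn a skill": 0.8,
    "skip work": -0.5,
    "start an argument": -1.0,
}


def toy_generate(rng: random.Random) -> str:
    return rng.choice(sorted(TOY_REWARD))


def toy_life_reward(sample: str) -> float:
    return TOY_REWARD[sample]


accepted = rejection_sample(toy_generate, toy_life_reward, n_candidates=20, threshold=0.5)
# `accepted` would then serve as the fine-tuning set in a rejection-sampling loop.
print(len(accepted), "of 20 candidates kept")
```

The accepted samples form the training data for the next round, so the threshold controls the trade-off between data quality and data volume.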

Agent and Tool Ecosystem · Alignment and RLHF · Agentopia · Agentopia: Long-Term Life Simulation and Learning in Agent Societies

4 · Hugging Face Blog · 1mo ago

TextQuests: How Good are LLMs at Text-Based Video Games?

A Hugging Face blog post introduces TextQuests, an evaluation framework that tests LLMs on text-based video games as a proxy for interactive reasoning, planning, and language understanding. The benchmark assesses how well models can navigate, solve puzzles, and maintain state across multi-turn interactions in classic interactive fiction environments. This type of evaluation targets agentic capabilities including long-horizon planning and grounded language understanding.

Evaluation and Benchmarking · Agent and Tool Ecosystem · TextQuests · Hugging Face

4 · OpenAI Blog · 1mo ago

OpenAI Releases Neural MMO: Massively Multiagent RL Game Environment

OpenAI released Neural MMO, a massively multiagent game environment designed for reinforcement learning research. The platform supports a large and variable number of agents operating within a persistent, open-ended task structure. The environment is designed to encourage emergent behaviors including better exploration, divergent niche formation, and improved overall agent competence through multi-species competition.

Evaluation and Benchmarking · Agent and Tool Ecosystem · Reinforcement Learning · OpenAI · Neural MMO

8 · Hugging Face Blog · 1mo ago

GGML and llama.cpp Join Hugging Face to Ensure Long-Term Progress of Local AI

GGML and llama.cpp, the foundational open-source libraries enabling efficient local inference of large language models, are joining Hugging Face. The move is intended to secure the long-term development and sustainability of the projects that underpin much of the local and on-device AI ecosystem. It also consolidates key open-weights inference infrastructure under the Hugging Face umbrella.

Open Weights Progress · Inference Economics · Georgi Gerganov · llama.cpp · Hugging Face · +2 more

4 · arXiv · cs.CL · 5d ago

LoSoNA benchmark evaluates LLM adaptation to implicit local social norms in group chats

Researchers introduce LoSoNA, a benchmark for testing whether LLM-based agents can infer and adapt to unstated local conversational norms in multi-party chat scenarios. Each scenario presents a group-chat transcript where non-subject participants implicitly demonstrate a hidden norm, followed by an elicitor turn. Eight frontier and open-weight models are evaluated under four prompting conditions. Naive prompting performs poorly for most models, while explicit norm-aware prompting yields uneven gains: Gemini 3.1 Pro reaches 84.2% and Claude Fable 5 reaches 81.6%. The work contributes to growing interest in evaluating LLM social and pragmatic capabilities beyond factual or reasoning tasks.

Evaluation and Benchmarking · Agent and Tool Ecosystem · Gemini 3.1 Pro · Claude Fable 5 · LoSoNA

5 · arXiv · cs.CL · 19d ago

If LLMs Have Human-Like Attributes, Then So Does Age of Empires II

This paper critiques the widespread practice of ascribing anthropomorphic attributes (e.g., morality, language understanding) to LLMs, arguing that such conclusions are empirically non-unique. The authors demonstrate this by training a neural network on Age of Empires II and showing that similar attribute-ascription logic could apply to arbitrary substrates like LEGO or urban infrastructure. They propose a 'null assumption' of LLM non-uniqueness as a methodological baseline for experiments, and prove that Age of Empires II is functionally- and Turing-complete as a supporting argument.

Evaluation and Benchmarking · AI Safety Research · Age of Empires II · large language models · anthropomorphism in AI · +2 more

4 · Hugging Face Blog · 1mo ago

Deploy LLMs with Hugging Face Inference Endpoints

Hugging Face published a guide on deploying large language models using its Inference Endpoints service. The post covers how to set up scalable, production-ready LLM deployments with minimal infrastructure overhead. It targets developers looking to move from experimentation to hosted inference without managing raw compute.

Inference Economics · Enterprise Deployment Patterns · Hugging Face Inference Endpoints · Hugging Face