6 arXiv cs.AI (Artificial Intelligence)·17d ago

Humanoid-GPT: GPT-style Transformer trained on 2B-frame motion corpus for zero-shot humanoid control

Researchers introduce Humanoid-GPT, a causal Transformer pre-trained on a 2-billion-frame retargeted motion corpus that unifies major mocap datasets with large-scale in-house recordings for whole-body humanoid control. The model achieves zero-shot generalization to unseen motions and control tasks, overcoming the agility-generalization trade-off seen in prior MLP-based trackers. Scaling analyses demonstrate a new performance frontier for dynamic motion tracking without task-specific fine-tuning.

Frontier Model Releases Humanoid-GPT Humanoid-GPT: Scaling Data and Structure for Zero-Shot Motion Tracking

Related events (8)

6 OpenAI Blog·1mo ago

Image GPT: Transformer Models Applied to Pixel Sequences for Image Generation and Classification

OpenAI demonstrates that a large transformer model trained autoregressively on pixel sequences can generate coherent image completions and samples, analogous to text generation. The work establishes a correlation between generative sample quality and downstream image classification accuracy. The best generative model achieves features competitive with top convolutional networks in the unsupervised setting, suggesting shared representational principles across modalities.

Frontier Model Releases Multimodal Progress Transformers convolutional neural network OpenAI +2 more

5 Berkeley AI Research (BAIR) Blog·1mo ago

PEVA: Whole-Body Conditioned Egocentric Video Prediction for Embodied World Models

Researchers from BAIR introduce PEVA (Predicting Ego-centric Video from human Actions), a model that generates first-person video frames conditioned on 48-dimensional whole-body kinematic pose trajectories. The model uses an autoregressive conditional diffusion transformer trained on the Nymeria dataset, which pairs real-world egocentric video with body pose capture. PEVA can generate atomic action videos, simulate counterfactuals, and support long video generation, representing a step toward world models grounded in physically embodied human agents.

Agent and Tool Ecosystem Multimodal Progress PEVA Conditional Diffusion Transformer Berkeley AI Research (BAIR) +2 more

5 OpenAI Blog·1mo ago

GPT-2: 6-Month Follow-Up — 774M Parameter Model Released

OpenAI released the 774 million parameter version of GPT-2 as part of its staged release strategy, following the 124M model in February 2019 and the 355M model in May 2019. The release is accompanied by an open-source legal agreement to facilitate model-sharing partnerships between organizations. OpenAI also published a technical report on coordinating with the AI research community around publication norms and staged disclosure practices.

Frontier Model Releases Open Weights Progress GPT-2 124M GPT-2 OpenAI +2 more

9 OpenAI Blog·1mo ago

Improving Language Understanding with Unsupervised Learning (GPT-1)

OpenAI published the GPT-1 paper in June 2018, demonstrating state-of-the-art results across diverse language tasks by combining transformer architectures with unsupervised pre-training followed by supervised fine-tuning. The approach is task-agnostic and scalable, showing that pre-training on large unlabeled text corpora and then fine-tuning on specific tasks yields strong generalization. This work established the foundational paradigm that would evolve into GPT-2, GPT-3, and subsequent large language models.

Frontier Model Releases Open Weights Progress Transformers GPT-1 OpenAI +3 more

5 OpenAI Blog·1mo ago

GPT-2 1.5B Full Release Completes OpenAI's Staged Release Experiment

OpenAI released the full 1.5B parameter GPT-2 model along with code and weights, completing its staged release process that began earlier in 2019. The release also includes tooling to help detect GPT-2 outputs. OpenAI frames this as a test case for responsible staged release practices for future powerful models, acknowledging that larger models had already been released by others in the interim.

Open Weights Progress AI Safety Research GPT-2 OpenAI +1 more

9 OpenAI Blog·1mo ago

Introducing GPT-5.2

OpenAI has released GPT-5.2, described as its most advanced frontier model for professional use, featuring state-of-the-art reasoning, long-context understanding, coding, and vision capabilities. The model is available through ChatGPT and the OpenAI API. It is positioned to support faster and more reliable agentic workflows.

Long Context Evolution Frontier Model Releases GPT-5.2 ChatGPT OpenAI API +4 more

8 OpenAI Blog·1mo ago

Better language models and their implications

OpenAI announced GPT-2, a large-scale unsupervised language model capable of generating coherent multi-paragraph text and achieving state-of-the-art performance on language modeling benchmarks. The model demonstrated zero-shot capability across reading comprehension, machine translation, question answering, and summarization without task-specific fine-tuning. OpenAI notably withheld the full model, citing misuse concerns, which made this an early high-profile instance of a staged, responsible release policy.

Frontier Model Releases Evaluation and Benchmarking GPT-2 zero-shot learning unsupervised language modeling +3 more

6 Hugging Face Blog·1mo ago

Making LLMs lighter with AutoGPTQ and transformers

Hugging Face announces native integration of AutoGPTQ into the transformers library, enabling 4-bit quantized inference for large language models. The integration allows users to load and run GPTQ-quantized models directly through the standard transformers API with minimal code changes. This lowers the hardware barrier for deploying LLMs by significantly reducing VRAM requirements while maintaining competitive performance.

Open Weights Progress Inference Economics Transformers Hugging Face AutoGPTQ +2 more