3·Hugging Face Blog·1mo ago

Training CodeParrot from Scratch

Hugging Face published a detailed walkthrough of training CodeParrot, a GPT-2-style language model trained from scratch on GitHub code data. The post covers dataset preparation, tokenizer training, model configuration, and distributed training setup using the Accelerate library. It serves as both a technical tutorial and a demonstration of open-source code generation model development practices circa late 2021.

Training Infrastructure Open Weights Progress GitHub Code Dataset CodeParrot GPT-2 Accelerate Hugging Face

Related guides (3)

Hugging Face

Hugging Face: The Home of Open-Source AI

Open Weights Progress·Topic guide

Open Weights Progress: How Freely Available AI Models Caught Up to the Frontier

Training Infrastructure·Topic guide

Training Infrastructure: The Compute Arms Race Powering Modern AI

Related events (8)

4·Hugging Face Blog·1mo ago

Personal Copilot: Train Your Own Coding Assistant

This Hugging Face blog post walks through fine-tuning an open-weights code model to create a personalized coding assistant. It covers dataset preparation, training techniques (likely LoRA/PEFT), and deployment considerations for self-hosted code completion. The post targets practitioners who want a GitHub Copilot-like experience without relying on proprietary APIs.

Open Weights Progress Agent and Tool Ecosystem PEFT LoRA Hugging Face +1 more

3·Hugging Face Blog·1mo ago

Pre-Train BERT with Hugging Face Transformers and Habana Gaudi

This Hugging Face blog post from August 2022 describes how to pre-train a BERT model from scratch using the Hugging Face Transformers library on Habana Gaudi hardware accelerators. It covers the full pipeline including data preparation, tokenizer training, and masked language modeling pretraining. The post serves as both a technical tutorial and a demonstration of Habana Gaudi's viability as an alternative AI training accelerator.

Training Infrastructure Habana Gaudi Hugging Face Transformers Hugging Face +2 more

3·Hugging Face Blog·1mo ago

Training a Language Model with Hugging Face Transformers Using TensorFlow and TPUs

This Hugging Face blog post provides a technical walkthrough for training a language model using TensorFlow and Google TPUs via the Transformers library. It covers the practical setup, data pipeline, and training configuration required to leverage TPU hardware with the TF ecosystem. The post serves as a tutorial bridging Hugging Face tooling with TPU-based infrastructure.

Training Infrastructure Agent and Tool Ecosystem Google TPU Hugging Face Transformers Hugging Face +1 more

5·Hugging Face Blog·1mo ago

Creating a Coding Assistant with StarCoder

This Hugging Face blog post describes the process of building StarChat-Alpha, a conversational coding assistant fine-tuned from the StarCoder large language model. The post covers the instruction-tuning methodology used to adapt StarCoder for chat-style interactions, including dataset preparation and training details. It represents an early example of open-weights coding LLMs being adapted into assistant-style deployments.

Open Weights Progress Agent and Tool Ecosystem BigCode Hugging Face StarCoder2 +2 more

5·OpenAI Blog·1mo ago

Coding and Design with GPT-5

OpenAI published a blog post highlighting GPT-5's capabilities in coding and design workflows. The post appears to be a use-case showcase demonstrating how GPT-5 enables new possibilities in these domains. As a Tier 1 source announcement, it signals continued OpenAI promotion of GPT-5 for developer and creative audiences. Specific technical details are not provided in the body excerpt.

Frontier Model Releases Agent and Tool Ecosystem OpenAI GPT-5.5

8·OpenAI Blog·1mo ago

Better language models and their implications

OpenAI announced GPT-2, a large-scale unsupervised language model capable of generating coherent multi-paragraph text and achieving state-of-the-art performance on language modeling benchmarks. The model demonstrated zero-shot capability across reading comprehension, machine translation, question answering, and summarization without task-specific fine-tuning. OpenAI notably withheld the full model at launch, citing misuse concerns, which made it an early high-profile instance of a staged release policy.

Frontier Model Releases Evaluation and Benchmarking GPT-2 zero-shot learning unsupervised language modeling +3 more

8·OpenAI Blog·1mo ago

Introducing GPT-5.3-Codex-Spark

OpenAI announced GPT-5.3-Codex-Spark, which it describes as its first real-time coding model. According to the announcement, it generates output 15x faster than prior coding models and supports a 128k context window. The model is currently available in research preview for ChatGPT Pro subscribers.

Long Context Evolution Frontier Model Releases OpenAI GPT-5.3-Codex-Spark ChatGPT Pro +2 more

3·Hugging Face Blog·1mo ago

Deploy GPT-J 6B for Inference Using Hugging Face Transformers and Amazon SageMaker

This Hugging Face blog post provides a tutorial for deploying the GPT-J 6B open-weights language model on Amazon SageMaker using the Hugging Face Transformers library. It covers the infrastructure and tooling steps needed to serve a large language model in a managed cloud environment. The post reflects early 2022 patterns for productionizing open-weight models via cloud ML platforms.

Open Weights Progress Inference Economics Amazon SageMaker Hugging Face Transformers Hugging Face +3 more