promptfoo: open-source LLM testing and red-teaming framework trending on GitHub
promptfoo is a TypeScript-based open-source tool for testing prompts, agents, and RAG pipelines, with built-in red-teaming and vulnerability scanning. It uses declarative configs, integrates with the command line and CI/CD pipelines, and can benchmark prompts across major models including GPT, Claude, Gemini, and DeepSeek. The project has accumulated 22,323 stars, with 46 added today, and claims usage by OpenAI and Anthropic.
Related events (8)
prompt-optimizer: Open-Source TypeScript Prompt Optimization Tool
prompt-optimizer is an open-source TypeScript tool designed to help users write better prompts and improve AI outputs. The repository has accumulated 29,603 total stars with 76 new stars today, indicating sustained community interest. It represents a category of tooling focused on prompt engineering automation and optimization.
OpenAI to Acquire Promptfoo
OpenAI announced the acquisition of Promptfoo, an AI security platform focused on identifying and remediating vulnerabilities in AI systems during development. The acquisition signals OpenAI's intent to deepen its enterprise security capabilities. Promptfoo has been widely used by developers to red-team and evaluate LLM applications for safety and reliability issues.
Meta releases Llama Prompt Guard 2 (86M) for prompt injection and jailbreak detection
Meta released Llama Prompt Guard 2-86M, a DeBERTa-v2-based text classification model on Hugging Face designed for safety filtering, specifically prompt injection and jailbreak detection. The model is tagged with llama4, suggesting it is part of the Llama 4 safety tooling ecosystem. With over 122K downloads, it has seen meaningful early adoption.
Langfuse: Open Source LLM Engineering Platform Trending on GitHub
Langfuse is an open-source LLM engineering platform providing observability, metrics, evaluations, prompt management, and dataset tooling. It integrates with OpenTelemetry, LangChain, OpenAI SDK, and LiteLLM. The project has accumulated 28,075 GitHub stars with 89 new stars today, indicating sustained community traction. Backed by Y Combinator (W23), it represents a notable entry in the LLM ops/tooling ecosystem.
oh-my-pi: Terminal AI Coding Agent with Hash-Anchored Edits and LSP Integration
oh-my-pi is an open-source TypeScript AI coding agent designed for terminal use, featuring hash-anchored file edits, an optimized tool harness, LSP integration, Python execution, browser access, and subagent support. The project has accumulated 5,362 GitHub stars with 237 added today, indicating rapid community traction. It represents a self-contained agentic coding environment targeting developer workflows in the terminal.
Meta releases Llama Prompt Guard 2 (22M) safety classifier on Hugging Face
Meta released Llama Prompt Guard 2-22M, a lightweight 22-million-parameter text classification model for prompt safety, published on Hugging Face under the meta-llama organization. The model is based on DeBERTa-v2 architecture and tagged for safety use cases including prompt injection and jailbreak detection. It is part of the Llama 4 safety tooling ecosystem and supports English and French.
FastGPT: open-source knowledge-base platform with RAG and visual workflow orchestration
FastGPT is an open-source TypeScript platform for building knowledge-based question-answering systems on top of LLMs, featuring data processing pipelines, RAG retrieval, and a visual AI workflow editor. The project has accumulated 28,533 GitHub stars, with 65 added today, indicating steady community traction. It targets developers who want to deploy RAG-based QA systems without extensive configuration.
GPT-5.4 released with tool search, computer use, and frontier benchmark performance
OpenAI released GPT-5.4 in Thinking and Pro variants, featuring an expanded context window (up to 1.05M input tokens), native computer use, tool search, and adjustable reasoning levels. In independent testing by Artificial Analysis, GPT-5.4 Pro at xhigh reasoning achieved state-of-the-art results on GDP-Val-AA, BrowseComp, Terminal-Bench-Hard, SWE-Bench-Pro, and MCP Atlas, while trailing Gemini 3.1 Pro Preview on MMMU-Pro and Humanity's Last Exam. Pro pricing sits at the top of the market ($30 per million input tokens and $180 per million output tokens), and the model also powers Codex, OpenAI's competitor to Claude Code. The item is reported via The Batch (tier 2 commentary) and includes additional context on Andrew Ng's chub CLI tool for sharing agent documentation.



