4·GitHub Trending (AI/LLM filtered)·3d ago

promptfoo: open-source LLM testing and red-teaming framework trending on GitHub

promptfoo is a TypeScript-based open-source tool for testing prompts, agents, and RAG pipelines, with built-in red-teaming and vulnerability scanning. It is driven by declarative configs, runs from the CLI or in CI/CD, and benchmarks across major models including GPT, Claude, Gemini, and DeepSeek. The project has accumulated 22,323 stars, 46 of them added today, and claims OpenAI and Anthropic among its users.

AI Safety Research·Agent and Tool Ecosystem·DeepSeek V4·Promptfoo·OpenAI·Anthropic

Related guides (4)

OpenAI

OpenAI: The Lab That Made AI a Household Word

DeepSeek V4

DeepSeek V4: The Open-Weights Giant Reshaping AI Economics

Anthropic

Anthropic: The AI Safety Company at the Center of the Frontier

AI Safety Research·Topic guide

AI Safety Research: From Lab Evals to Geopolitical Flashpoint

Related events (8)

3·GitHub Trending·28d ago

prompt-optimizer: Open-Source TypeScript Prompt Optimization Tool

prompt-optimizer is an open-source TypeScript tool designed to help users write better prompts and improve AI outputs. The repository has accumulated 29,603 total stars with 76 new stars today, indicating sustained community interest. It represents a category of tooling focused on prompt engineering automation and optimization.

Agent and Tool Ecosystem·linshenkx·prompt-optimizer

7·OpenAI Blog·1mo ago

OpenAI to Acquire Promptfoo

OpenAI announced the acquisition of Promptfoo, an AI security platform focused on identifying and remediating vulnerabilities in AI systems during development. The acquisition signals OpenAI's intent to deepen its enterprise security capabilities. Promptfoo has been widely used by developers to red-team and evaluate LLM applications for safety and reliability issues.

AI Safety Research·Enterprise Deployment Patterns·Promptfoo·OpenAI·+1 more

5·Meta Llama·11d ago

Meta releases Llama Prompt Guard 2 (86M) for prompt injection and jailbreak detection

Meta released Llama Prompt Guard 2-86M, a DeBERTa-v2-based text classification model on Hugging Face designed for safety filtering, specifically prompt injection and jailbreak detection. The model is tagged with llama4, suggesting it is part of the Llama 4 safety tooling ecosystem. With over 122K downloads, it has seen meaningful early adoption.

Frontier Model Releases·AI Safety Research·Hugging Face·Llama Prompt Guard 2-86M·DeBERTa-v3·+1 more

4·GitHub Trending·24d ago

Langfuse: Open Source LLM Engineering Platform Trending on GitHub

Langfuse is an open-source LLM engineering platform providing observability, metrics, evaluations, prompt management, and dataset tooling. It integrates with OpenTelemetry, LangChain, OpenAI SDK, and LiteLLM. The project has accumulated 28,075 GitHub stars with 89 new stars today, indicating sustained community traction. Backed by Y Combinator (W23), it represents a notable entry in the LLM ops/tooling ecosystem.

Evaluation and Benchmarking·Agent and Tool Ecosystem·OpenTelemetry·Langfuse·Y Combinator·+3 more

4·GitHub Trending·1mo ago

oh-my-pi: Terminal AI Coding Agent with Hash-Anchored Edits and LSP Integration

oh-my-pi is an open-source TypeScript AI coding agent designed for terminal use, featuring hash-anchored file edits, an optimized tool harness, LSP integration, Python execution, browser access, and subagent support. The project has accumulated 5,362 GitHub stars with 237 added today, indicating rapid community traction. It represents a self-contained agentic coding environment targeting developer workflows in the terminal.

Agent and Tool Ecosystem·can1357·Language Server Protocol·oh-my-pi

5·Meta Llama·11d ago

Meta releases Llama Prompt Guard 2 (22M) safety classifier on Hugging Face

Meta released Llama Prompt Guard 2-22M, a lightweight 22-million-parameter text classification model for prompt safety, published on Hugging Face under the meta-llama organization. The model is based on DeBERTa-v2 architecture and tagged for safety use cases including prompt injection and jailbreak detection. It is part of the Llama 4 safety tooling ecosystem and supports English and French.

Frontier Model Releases·AI Safety Research·Hugging Face·Llama Prompt Guard 2-86M·DeBERTa-v3·+1 more

3·GitHub Trending·2d ago

FastGPT: open-source knowledge-base platform with RAG and visual workflow orchestration

FastGPT is an open-source TypeScript platform for building knowledge-based question-answering systems on top of LLMs, featuring data processing pipelines, RAG retrieval, and a visual AI workflow editor. The project has accumulated 28,533 GitHub stars with modest daily growth (+65), indicating steady community traction. It targets developers who want to deploy RAG-based QA systems without extensive configuration.

Enterprise Deployment Patterns·Agent and Tool Ecosystem·labring·FastGPT

8·The Batch·17d ago

GPT-5.4 released with tool search, computer use, and frontier benchmark performance

OpenAI released GPT-5.4 in Thinking and Pro variants, featuring an expanded context window (up to 1.05M input tokens), native computer use, tool search capabilities, and adjustable reasoning levels. In independent testing by Artificial Analysis, GPT-5.4 Pro at xhigh reasoning achieved state-of-the-art results on GDPval-AA, BrowseComp, Terminal-Bench-Hard, SWE-Bench-Pro, and MCP Atlas, while trailing Gemini 3.1 Pro Preview on MMMU-Pro and Humanity's Last Exam. Pro pricing sits at the top of the market ($30 per million input tokens, $180 per million output tokens), and the release also powers Codex, OpenAI's competitor to Claude Code. The item is reported via The Batch (tier 2 commentary) and includes additional context on Andrew Ng's chub CLI tool for agent documentation sharing.

Frontier Model Releases·Inference Economics·DeepLearning.AI·Artificial Analysis Intelligence Index·Claude Opus 4.6·+14 more