Entity · model

LlamaGuard

model · active · llamaguard-62d73cd0 · 2 events · first seen May 19, 2026

Aliases: LlamaGuard, LlamaGuard 3, LlamaGuard 4

Co-occurring entities

ModernBERT · AILuminate · ShieldGemma · SorryBench · JailbreakBench · StrongReject · Ettin · CyberSecEval 2 · Hugging Face · Meta

More like this (12)

Llama Guard 4 · Llama · Llama Guard 3 11B Vision · Llama Guard 3 1B · Llama Prompt Guard 2-86M · Llama 2 · Code Llama · TinyLlama · Llama 3 · Llama-3 · Llama 4 Scout · AprielGuard

Recent events (2)

5 · arXiv · cs.CL · Jun 25, 2026

Systematic comparison of encoder vs. decoder safety judges for LLM adversarial evaluation

A new arXiv preprint evaluates whether fine-tuned encoder classifiers from the ModernBERT family (ModernBERT and Ettin) can replace LLM-based safety judges for detecting harmful outputs in user-model conversations. The study benchmarks the encoders against rule-based methods, fine-tuned LLM classifiers, and LLM judges including LlamaGuard 3/4, ShieldGemma, StrongReject, and Claude-as-a-judge across multiple adversarial attack types. Results are reported as F1, false negative rate, and precision-recall, broken down by attack technique, and the study offers practical guidance on cost-latency tradeoffs for production safety pipelines.

Evaluation and Benchmarking · Inference Economics · ModernBERT · AILuminate · LlamaGuard · +6 more
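
For readers unfamiliar with the metrics named in the preprint summary above, the following is a minimal sketch of how F1 and false negative rate can be computed for a binary safety judge. It is illustrative only and is not taken from the preprint: the function name `safety_judge_metrics`, the label convention (1 = harmful, 0 = benign), and the sample labels are all assumptions made for this example.

```python
def safety_judge_metrics(y_true, y_pred):
    """Compute precision, recall, F1, and false negative rate.

    Assumed convention (hypothetical, not from the preprint):
    1 = harmful output, 0 = benign output.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # harmful, flagged
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # benign, flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # harmful, missed

    # Guard each ratio against an empty denominator.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    # False negative rate: share of harmful outputs the judge missed.
    fnr = fn / (fn + tp) if (fn + tp) else 0.0

    return {"precision": precision, "recall": recall, "f1": f1, "fnr": fnr}


# Made-up labels for illustration: four harmful and two benign outputs.
y_true = [1, 1, 1, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 1]
metrics = safety_judge_metrics(y_true, y_pred)
print(metrics)
```

In a safety setting the false negative rate is often watched separately from F1, since a missed harmful output is usually costlier than a benign output that was flagged by mistake.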

5 · Hugging Face Blog · May 19, 2026

CyberSecEval 2 - A Comprehensive Evaluation Framework for Cybersecurity Risks and Capabilities of Large Language Models

CyberSecEval 2 is a benchmark framework designed to evaluate both the cybersecurity risks and capabilities of large language models. The framework appears to be hosted or featured on Hugging Face's leaderboard infrastructure, extending prior cybersecurity evaluation work. It assesses LLMs across multiple dimensions of security-relevant behavior, including potential for misuse and defensive capabilities.

Evaluation and Benchmarking · AI Safety Research · CyberSecEval 2 · LlamaGuard · Hugging Face · +1 more