Entity · model

Phi-4-mini

model · active · phi-4-mini-3d9c04e6 · 2 events · first seen May 21, 2026

Aliases: Phi-4-mini

Co-occurring entities

Multi-Source Cybersecurity Logs: An ATT&CK-Labeled Dataset and SLM Evaluation · CICIDS · Llama 3.2 · LoRA · Qwen2.5-1.5B · MITRE ATT&CK · UNSW-NB15 · Atlas · RLVR · ROUGE-L · AIME24 · GRPO · AIME25 · PPO · Qwen3-4B · GPQA Diamond · Qwen3-1.7B · LamPO · MATH-500

More like this (12)

o4-mini · GPT-4o mini · GPT-4.1 mini · Phi-2 · GPT-5.4 mini · Mini-R1 · o3-mini · GPT-4b micro · o1-mini · Palmyra-mini · Phi-3.5 · o4-mini-high

Recent events (2)

5 · arXiv · cs.LG · Jun 17, 2026

Multi-source cybersecurity log dataset with ATT&CK labels and SLM fine-tuning evaluation

Researchers introduce a multi-source cybersecurity log dataset of 870 sessions (~2.3M events) capturing system, network, and browser activity on Windows endpoints, with per-entry MITRE ATT&CK technique labels across 12 tactics and 53 techniques. The dataset addresses a gap in existing public datasets (CICIDS, UNSW-NB15, ATLAS), which lack combined multi-source coverage with fine-grained ATT&CK labeling. Three small language models (Qwen2.5-1.5B, Llama-3.2-3B, Phi-4-mini) were fine-tuned with LoRA on the dataset and reached chunk classification accuracy of 90–97%, versus ~8% for the base variants. ATT&CK technique identification remained harder, at 42% exact-match accuracy.

Evaluation and Benchmarking · AI Safety Research · Multi-Source Cybersecurity Logs: An ATT&CK-Labeled Dataset and SLM Evaluation · CICIDS · Llama 3.2
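
The summary above reports an exact-match accuracy for ATT&CK technique identification. The following is a minimal sketch of how such a metric could be computed, assuming that "exact match" means string equality of technique IDs after trimming whitespace and normalizing case. That definition, the function name `exact_match_accuracy`, and the sample IDs are illustrative assumptions, not taken from the paper or its dataset.

```python
from typing import Sequence


def exact_match_accuracy(predicted: Sequence[str], gold: Sequence[str]) -> float:
    """Fraction of predictions whose technique ID equals the gold ID exactly.

    Assumption: IDs are compared after stripping whitespace and upper-casing.
    A sub-technique (e.g. T1003.001) does NOT match its parent (T1003).
    """
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    if not gold:
        return 0.0
    hits = sum(
        p.strip().upper() == g.strip().upper()
        for p, g in zip(predicted, gold)
    )
    return hits / len(gold)


# Illustrative labels only; these are not rows from the dataset.
gold_ids = ["T1059.001", "T1003", "T1566.001", "T1021.001"]
pred_ids = ["T1059.001", "T1003.001", "t1566.001", "T1047"]

print(exact_match_accuracy(pred_ids, gold_ids))
```

Under this strict definition a prediction that names the right parent technique but the wrong sub-technique counts as a miss, which is one plausible reason technique-level scores sit well below chunk classification accuracy.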

5 · arXiv · cs.CL · May 21, 2026

LamPO: Lambda-Style Policy Optimization with Pairwise Decomposed Advantage for Reasoning LMs

LamPO proposes a new RLVR training objective that replaces GRPO's scalar group-relative advantages with a Pairwise Decomposed Advantage, aggregating pairwise reward gaps within response groups and weighting comparisons by confidence-aware log-probability differences. The method retains a critic-free, clipped-update PPO-style structure and optionally adds a ROUGE-L-based dense auxiliary reward to reduce sparsity. Experiments on AIME24, AIME25, MATH-500, and GPQA-Diamond using Qwen3-1.7B, Qwen3-4B, and Phi-4-mini show consistent improvements over GRPO and other RLVR variants with more stable training dynamics.

Frontier Model Releases · Evaluation and Benchmarking · RLVR · ROUGE-L · AIME24
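
To make the contrast in the summary above concrete, the sketch below compares a GRPO-style scalar group-relative advantage with a simple pairwise aggregation of reward gaps. The first function follows the commonly used GRPO normalization (reward minus group mean, divided by group standard deviation; population standard deviation is used here, and implementations differ). The second function is only an assumption-laden illustration: the summary does not give LamPO's formula, so the sigmoid weighting on log-probability differences, the `temperature` parameter, and the function names are hypothetical placeholders rather than the paper's method.

```python
import math
from typing import List, Sequence


def grpo_advantages(rewards: Sequence[float], eps: float = 1e-8) -> List[float]:
    """GRPO-style advantage: standardize each reward within its response group."""
    n = len(rewards)
    mean = sum(rewards) / n
    std = math.sqrt(sum((r - mean) ** 2 for r in rewards) / n)
    return [(r - mean) / (std + eps) for r in rewards]


def pairwise_advantages(
    rewards: Sequence[float],
    logprobs: Sequence[float],
    temperature: float = 1.0,
) -> List[float]:
    """Hypothetical pairwise aggregation (NOT the LamPO formula).

    For response i, take a weighted average of reward gaps (r_i - r_j) over
    all other responses j. The weight is a placeholder "confidence" term:
    a sigmoid of the absolute log-probability difference.
    """
    n = len(rewards)
    if n < 2:
        return [0.0] * n
    advantages = []
    for i in range(n):
        total, weight_sum = 0.0, 0.0
        for j in range(n):
            if i == j:
                continue
            gap = abs(logprobs[i] - logprobs[j]) / temperature
            weight = 1.0 / (1.0 + math.exp(-gap))  # in [0.5, 1), never zero
            total += weight * (rewards[i] - rewards[j])
            weight_sum += weight
        advantages.append(total / weight_sum)
    return advantages


# Four sampled responses to one prompt with binary verifiable rewards.
rewards = [1.0, 0.0, 0.0, 1.0]
logprobs = [-12.0, -12.0, -12.0, -12.0]  # equal confidence, for illustration

print(grpo_advantages(rewards))
print(pairwise_advantages(rewards, logprobs))
```

With equal log-probabilities all weights coincide, and the pairwise form reduces to the mean-centered reward scaled by n/(n-1). Any difference from a scalar group-relative advantage therefore comes entirely from how the comparisons are weighted, which is the part this sketch leaves as a placeholder.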