Entity · benchmark

diff hunk taxonomy benchmark

benchmark · active · diff-hunk-taxonomy-benchmark-891b0c17 · 1 event · first seen May 26, 2026

Aliases: diff hunk taxonomy benchmark

Co-occurring entities

few-shot prompting · code review automation · LLM-based code change labeling pipeline

More like this (12)

hunk · harness-level benchmarks · CORE benchmark · temporally grounded QA benchmark · Human-Vehicle Interaction Benchmark · DPG Benchmark · SPOT benchmark · Bias Benchmark for Question Answering · DevDataBench · human alignment benchmarks (perceptual similarity, gloss, robustness, shape-texture) · DSIT-Taxonomies · RepoBench

Recent events (1)

4 · arXiv · cs.AI · May 26, 2026

Structure-Aware Code Change Labeling with LLMs via Two-Stage Taxonomy Pipeline

This paper presents a systematic study of using LLMs for taxonomy-based labeling of code diff hunks, going beyond summarization to assign structured labels capturing semantic attributes like renames, moves, and logic modifications. The authors introduce a two-stage pipeline combining diff-hunk labeling with structural refinement, using few-shot prompting to remain language-agnostic. Evaluated across four LLMs on a curated benchmark of natural and synthetic patches, the best configuration achieves 84% recall and 81% precision. Results suggest LLM-based structured labeling can complement static analysis tools in code review workflows.
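The two-stage idea in the summary can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration and is not taken from the paper: the three-label set, the `Hunk` type, the prompt layout, the `llm` callable (any function from prompt string to reply string), and the single refinement rule, which marks two hunks as a move when one removes exactly the lines the other adds. The micro-averaged precision and recall at the end are one common way to score multi-label output; the summary does not say which averaging the authors used.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical label set; the paper's actual taxonomy is not given in the summary.
LABELS = ("rename", "move", "logic_change")


@dataclass(frozen=True)
class Hunk:
    path: str
    removed: tuple[str, ...]
    added: tuple[str, ...]


def render(hunk: Hunk) -> str:
    """Render a hunk as simplified unified-diff text."""
    lines = [f"--- {hunk.path}"]
    lines += [f"-{line}" for line in hunk.removed]
    lines += [f"+{line}" for line in hunk.added]
    return "\n".join(lines)


def build_prompt(hunk: Hunk, shots: list[tuple[Hunk, set[str]]]) -> str:
    """Few-shot prompt: instruction, labeled examples, then the target hunk."""
    parts = ["Label the diff hunk with any of: " + ", ".join(LABELS)]
    for example, labels in shots:
        parts.append(render(example) + "\nLabels: " + ", ".join(sorted(labels)))
    parts.append(render(hunk) + "\nLabels:")
    return "\n\n".join(parts)


def stage_one(hunks, shots, llm: Callable[[str], str]) -> list[set[str]]:
    """Stage 1: label each hunk independently; drop anything outside LABELS."""
    results = []
    for hunk in hunks:
        reply = llm(build_prompt(hunk, shots))
        results.append({tok.strip() for tok in reply.split(",")} & set(LABELS))
    return results


def stage_two(hunks, labels: list[set[str]]) -> list[set[str]]:
    """Stage 2 (assumed rule): cross-hunk check that tags matching
    removed/added line blocks in different hunks as a move."""
    refined = [set(lbls) for lbls in labels]
    for i, a in enumerate(hunks):
        for j, b in enumerate(hunks):
            if i != j and a.removed and a.removed == b.added:
                refined[i].add("move")
                refined[j].add("move")
    return refined


def micro_precision_recall(predicted, gold) -> tuple[float, float]:
    """Micro-averaged precision and recall over per-hunk label sets."""
    tp = sum(len(p & g) for p, g in zip(predicted, gold))
    n_pred = sum(len(p) for p in predicted)
    n_gold = sum(len(g) for g in gold)
    precision = tp / n_pred if n_pred else 0.0
    recall = tp / n_gold if n_gold else 0.0
    return precision, recall
```

In this sketch the first stage sees one hunk at a time, so a property such as a move, which only becomes visible when two hunks are compared, is left to the second stage. A real refinement step would likely rely on richer structural information than exact line equality.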

Enterprise Deployment Patterns · Agent and Tool Ecosystem · few-shot prompting · code review automation · diff hunk taxonomy benchmark