diff hunk taxonomy benchmark

benchmark · active · provisional · diff-hunk-taxonomy-benchmark-891b0c17 · 1 event · first seen 22d ago

Aliases: diff hunk taxonomy benchmark

Recent events (1)

4 · arXiv · cs.AI · 22d ago

Structure-Aware Code Change Labeling with LLMs via Two-Stage Taxonomy Pipeline

This paper presents a systematic study of using LLMs for taxonomy-based labeling of code diff hunks, going beyond summarization to assign structured labels capturing semantic attributes like renames, moves, and logic modifications. The authors introduce a two-stage pipeline combining diff-hunk labeling with structural refinement, using few-shot prompting to remain language-agnostic. Evaluated across four LLMs on a curated benchmark of natural and synthetic patches, the best configuration achieves 84% recall and 81% precision. Results suggest LLM-based structured labeling can complement static analysis tools in code review workflows.
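To make the reported recall and precision figures concrete, here is a minimal sketch of how such scores could be computed when each diff hunk carries a set of taxonomy labels. The summary does not say how the authors aggregate scores, so micro-averaging over (hunk, label) pairs is an assumption, and the hunk IDs and label names (`rename`, `move`, `logic_modification`) are hypothetical, chosen only to echo the attributes mentioned above.

```python
from typing import Dict, Set, Tuple


def micro_precision_recall(
    predicted: Dict[str, Set[str]],
    gold: Dict[str, Set[str]],
) -> Tuple[float, float]:
    """Micro-averaged precision and recall over (hunk, label) pairs.

    Assumption: a predicted label counts as correct only if the same
    label appears in the gold set for the same hunk.
    """
    tp = fp = fn = 0
    for hunk_id in predicted.keys() | gold.keys():
        pred = predicted.get(hunk_id, set())
        ref = gold.get(hunk_id, set())
        tp += len(pred & ref)   # labels both predicted and in gold
        fp += len(pred - ref)   # predicted but not in gold
        fn += len(ref - pred)   # in gold but missed by the prediction
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical toy data: three hunks, illustrative label names only.
gold = {
    "h1": {"rename"},
    "h2": {"move", "logic_modification"},
    "h3": {"logic_modification"},
}
predicted = {
    "h1": {"rename"},
    "h2": {"move"},                          # misses one gold label
    "h3": {"logic_modification", "rename"},  # adds one spurious label
}

precision, recall = micro_precision_recall(predicted, gold)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this toy case, three of the four predicted labels are correct and three of the four gold labels are recovered, so both scores come out to 0.75. A per-label (macro) average would be an equally plausible reading of the reported numbers and could give different values on imbalanced taxonomies.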