Variant Definitions

This directory contains only V8-hybrid — the variant proposed in PR #189.

All other variants (V1–V7) were used during the evaluation but are not included in this PR to keep the diff reviewable. They are published at:

ascerra/code-agent-eval-scenarios — variants/

Variant inventory

| ID | Name | Browse | Description |
|----|------|--------|-------------|
| V1 | fullsend-single-skill | view | PR #189 original: agent + single skill + scan-secrets |
| V2 | fullsend-multi-skill | view | PR #189 with skill split into 4 pieces |
| V3 | vanilla-claude | view | No guardrails baseline: just a prompt |
| V4 | claudemd-only | view | CLAUDE.md instructions only (stopped early) |
| V5 | apex | view | Enhanced V1 with reasoning protocol, self-review, minimal diff |
| V6 | apex-github | view | V5 hardcoded for GitHub (rigidity hurts) |
| V7 | ultimate | view | V5 + "understand before you act" + reproduction step |
| V8 | hybrid | here | Cleaned V1 + V5 minimal-diff + V7 reproduction/task-type |