The first time I watched an NX-OS engineer walk through a branch-collapse merge by hand, I started counting. By minute forty he had resolved nine of eighty-three conflicting files. Second monitor on the ticket, third on the diff, yellow legal pad with arrows on it. The arrows were the giveaway. When the tool you reach for is a legal pad, the tooling has lost.
The Conflict Resolver started as one question: what if the engineer's role wasn't to do the merge, but to approve the merge? The merge itself is something an LLM can reason about cleanly. The thing humans add isn't pattern matching; it's judgment about context that lives outside the diff.
The architecture
codedrop trigger
        │
        ▼
git_conflicts_scan.pl ───► spawns background agent
        │
        ▼
python orchestrator
        ├──► Codex CLI (per file)
        │
        ▼
audit_log.json
        │
        ▼
Node.js UI
        │
        ▼
apply_resolution.py
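To make the orchestrator step concrete, here is a minimal sketch of the per-file loop, assuming the scanner hands over a list of conflicted paths and the Codex CLI is driven non-interactively. The prompt shape, the `codex exec` invocation, and the audit-log fields are illustrative assumptions, not the production interface.

```python
import json
import subprocess
from pathlib import Path

AUDIT_LOG = Path("audit_log.json")

def resolve_file(path: str) -> dict:
    """Ask the model for a proposed resolution of one conflicted file."""
    conflicted = Path(path).read_text()
    prompt = (
        "Resolve the git merge conflict below; output only the resolved "
        f"contents of {path}.\n\n{conflicted}"
    )
    # One model invocation per file keeps failures isolated: a bad
    # resolution in one file never blocks the other eighty-two.
    proc = subprocess.run(
        ["codex", "exec", prompt],
        capture_output=True, text=True, check=True,
    )
    return {"file": path, "proposal": proc.stdout, "status": "pending_review"}

def main(conflicted_files: list[str]) -> None:
    # Nothing is applied here. The UI reads this log, the engineer
    # approves or overrides each proposal, and only then does
    # apply_resolution.py touch the working tree.
    entries = [resolve_file(p) for p in conflicted_files]
    AUDIT_LOG.write_text(json.dumps(entries, indent=2))

if __name__ == "__main__":
    main(["platform/routing/bgp.c"])  # hypothetical placeholder path
```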
V1 ships engineer-decision-only; V2 layers AI recommendations on top, with override and training-data capture.
V2: confidence-calibrated recommendations
V2 adds an AI recommendation alongside each pending decision: a confidence score, a one-sentence explanation, and a citation back to the part of the file that justified the call. The engineer scans, overrides the wrong ones, approves the rest. Every override becomes training data.
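A sketch of what one V2 decision record could carry; the type and field names (`Recommendation`, `Decision`, `citation`) are assumptions for illustration. The structural point is that an override is itself a labeled example: the model's confident answer paired with the human correction.

```python
# Sketch of a V2 decision record; all names here are assumptions.
from dataclasses import dataclass, asdict
from typing import Optional

@dataclass
class Recommendation:
    choice: str        # "ours", "theirs", or a merged hunk
    confidence: float  # model-reported probability in [0, 1]
    explanation: str   # the one-sentence justification shown in the UI
    citation: str      # file:line range that grounds the explanation

@dataclass
class Decision:
    file: str
    recommendation: Recommendation
    engineer_choice: str  # what the engineer actually approved

    @property
    def is_override(self) -> bool:
        return self.engineer_choice != self.recommendation.choice

    def training_example(self) -> Optional[dict]:
        # An override pairs a confident model answer with the human
        # correction: exactly the supervision signal V2 wants to bank.
        return asdict(self) if self.is_override else None
```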
The open problem is confidence calibration. A model that says "92% sure" needs to be wrong 8% of the time, no more and no less, or the engineers stop trusting the number — and once they stop trusting the number, they start re-reading every recommendation, and the loop is broken.
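One way to watch that property over time, reusing the `Decision` sketch above: bin recommendations by stated confidence and compare each bin's average confidence to the fraction the engineer actually approved. This is a standard reliability-diagram computation, not anything specific to the real pipeline.

```python
# Calibration check over accumulated decisions: in a well-calibrated
# system, each bin's stated confidence matches its approval rate.
def calibration_report(decisions: list, n_bins: int = 10) -> list[tuple]:
    bins: list[list] = [[] for _ in range(n_bins)]
    for d in decisions:
        i = min(int(d.recommendation.confidence * n_bins), n_bins - 1)
        bins[i].append(d)
    report = []
    for i, bucket in enumerate(bins):
        if not bucket:
            continue
        stated = sum(d.recommendation.confidence for d in bucket) / len(bucket)
        observed = sum(not d.is_override for d in bucket) / len(bucket)
        # A bin that says 0.92 but gets approved 0.70 of the time is
        # the failure mode that makes engineers re-read everything.
        report.append(
            (i / n_bins, (i + 1) / n_bins, stated, observed, len(bucket))
        )
    return report  # (bin_lo, bin_hi, mean_confidence, approval_rate, count)
```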