Joseph Rwanda
Live
2026
Featured

WaybillAgent

Walk the warehouse, Claude does the audit.

WaybillAgent turns warehouse auditing from a multi-day manual process into an AI-assisted guided walk, using phone capture and agentic reconciliation. Flagship build for Anthropic's Built with Opus 4.7 hackathon (selected in the top ~500 of 13,000+ applicants).

Stack

Claude Opus 4.7
Claude Managed Agents
TypeScript
Next.js
Supabase
Vercel
Computer Vision/OCR

Proof metrics

Hackathon Cohort: ~500 / 13,000+ applicants (top ~3.8%)
Audit Cycle: 2 weeks -> ~40 minutes
Device Shift: $2,000 scanner -> phone/glasses workflow

Evaluation scorecard

Production eval dimensions — how this system is judged before and after changes ship.

Label OCR accuracy
Eval harness in place

Test set of damaged, faded, and angled warehouse labels from real captures

Variance detection
Eval harness in place

Structured reconciliation cases against ERP master data

Session resume success
Eval harness in place

Long-horizon walk sessions with intentional interruption and retry

Cost per audit walk
Tracked in production

Selective high-effort reasoning only on variance paths; routine OCR stays cost-efficient

Problem

Warehouse audits often run for weeks with multiple field staff and heavy manual reconciliation in spreadsheets.

Damaged, faded, or angled labels fail frequently on traditional handheld scanners, creating repeated variance loops.

Audit workflows require sustained context across long sessions, not isolated one-shot API calls.

Solution

Built a stateful agent workflow for end-to-end warehouse walk sessions with resumable progress.

Used high-fidelity model vision to extract bin and label data from low-quality real-world images.

Applied selective high-effort reasoning only for variance classification while keeping routine OCR paths cost-efficient.

Added self-verification before report output to improve confidence for enterprise audit handoff.

Technical deep-dive

Stateful session design

Warehouse audits are not single-turn Q&A. A walk spans dozens of bins, intermittent connectivity, and operator pauses. WaybillAgent models the audit as a resumable session with explicit workflow states: capture, extract, lookup, reconcile, tag variance, and report.

Each state has clear entry and exit criteria and is persisted, so operators can stop mid-aisle and continue without losing context. This is a production requirement that demo-only agents tend to skip.
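
The workflow states named above can be sketched as a small state machine with a serializable session. This is a minimal illustration, not WaybillAgent's actual code: the `WalkSession` shape, the `advance`, `checkpoint`, and `resume` helpers, and the choice of JSON as the persistence format are all assumptions made for the example.

```typescript
// Workflow states taken from the text; everything else here is hypothetical.
type WalkState = "capture" | "extract" | "lookup" | "reconcile" | "tag_variance" | "report";

// Assumed session shape: which bin the operator is on and where in the workflow.
interface WalkSession {
  sessionId: string;
  binIds: string[];
  binIndex: number;
  state: WalkState;
}

// Per-bin transitions. "tag_variance" and "report" are handled separately below.
const NEXT: Record<"capture" | "extract" | "lookup" | "reconcile", WalkState> = {
  capture: "extract",
  extract: "lookup",
  lookup: "reconcile",
  reconcile: "tag_variance",
};

// Move the session one step forward without mutating the input.
function advance(s: WalkSession): WalkSession {
  if (s.state === "report") return s; // terminal state
  if (s.state === "tag_variance") {
    const hasMoreBins = s.binIndex + 1 < s.binIds.length;
    return hasMoreBins
      ? { ...s, binIndex: s.binIndex + 1, state: "capture" } // start the next bin
      : { ...s, state: "report" }; // all bins done, produce the report
  }
  return { ...s, state: NEXT[s.state] };
}

// Persist after every transition so an interrupted walk can be restored.
// A real system would write this to a database row rather than a string.
function checkpoint(s: WalkSession): string {
  return JSON.stringify(s);
}

function resume(snapshot: string): WalkSession {
  return JSON.parse(snapshot) as WalkSession;
}
```

Because the whole session is plain data, an operator pause or dropped connection only requires reloading the last checkpoint and calling `advance` again from the stored state.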

Selective reasoning and cost tradeoffs

Not every bin needs Opus-level reasoning. Routine label extraction runs on a cost-efficient vision path; high-effort reasoning activates only when variance classification or ambiguous reconciliation requires it.

This pattern keeps per-walk token spend predictable while preserving accuracy on the cases that actually block audit sign-off.
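
A rough sketch of that routing decision follows. The `BinObservation` fields, the `0.9` confidence threshold, and the per-call cost units are hypothetical values chosen for illustration; the source does not state how WaybillAgent scores confidence or prices calls.

```typescript
type Effort = "routine" | "high";

// Hypothetical result of the routine extraction path for one bin.
interface BinObservation {
  binId: string;
  extractedQty: number; // quantity read from the captured image
  expectedQty: number; // quantity from ERP master data
  ocrConfidence: number; // assumed 0..1 score from the extraction step
}

// Escalate only when the bin shows a variance or the read is ambiguous.
function chooseEffort(obs: BinObservation, minConfidence = 0.9): Effort {
  const hasVariance = obs.extractedQty !== obs.expectedQty;
  const isAmbiguous = obs.ocrConfidence < minConfidence;
  return hasVariance || isAmbiguous ? "high" : "routine";
}

// Estimate spend for a walk given a cost per call for each effort level.
// Costs are in arbitrary units supplied by the caller.
function estimateWalkCost(
  observations: BinObservation[],
  costPerCall: Record<Effort, number>,
): number {
  return observations.reduce((total, obs) => total + costPerCall[chooseEffort(obs)], 0);
}
```

With most bins matching the ERP record, the total is dominated by the routine rate, which is what keeps per-walk spend predictable under these assumptions.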

Self-verification before handoff

Before generating the variance report, the agent runs a self-verification pass: cross-check extracted codes against lookup results, flag low-confidence extractions, and surface items that need human review.

Enterprise audit handoff cannot tolerate silent failures — verification checkpoints are part of the workflow, not an afterthought.
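
As an illustration of such a checkpoint, the sketch below splits extractions into verified items and items that need human review. The `Extraction` shape, the reason labels, and the `0.85` threshold are assumptions for the example, not details taken from the project.

```typescript
// Hypothetical output of the extraction step for one label.
interface Extraction {
  binId: string;
  code: string;
  confidence: number; // assumed 0..1 score
}

type ReviewReason = "code_not_in_master_data" | "low_confidence";

interface ReviewItem {
  extraction: Extraction;
  reasons: ReviewReason[];
}

interface VerificationResult {
  verified: Extraction[];
  needsReview: ReviewItem[];
}

// Cross-check every extraction before any report is generated.
// Nothing is dropped: each item ends up in exactly one of the two lists.
function verify(
  extractions: Extraction[],
  masterCodes: Set<string>,
  minConfidence = 0.85,
): VerificationResult {
  const result: VerificationResult = { verified: [], needsReview: [] };
  for (const extraction of extractions) {
    const reasons: ReviewReason[] = [];
    if (!masterCodes.has(extraction.code)) reasons.push("code_not_in_master_data");
    if (extraction.confidence < minConfidence) reasons.push("low_confidence");
    if (reasons.length === 0) {
      result.verified.push(extraction);
    } else {
      result.needsReview.push({ extraction, reasons });
    }
  }
  return result;
}
```

Returning explicit reasons, rather than a single pass/fail flag, gives the reviewer something concrete to act on during handoff.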

Architecture

Capture layer: image capture from a phone or Meta glasses during the aisle walkthrough.

Interpretation layer: Claude Opus vision + extraction pipelines for labels and bin codes.

Agent layer: managed multi-step session coordinating scan, lookup, reconciliation, and variance tagging.

Data layer: ERP/master-data reconciliation plus structured variance report output.
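
To make the data layer concrete, here is a minimal sketch of reconciling counted quantities against ERP master data and emitting structured variance lines. The record shapes, the variance categories, and the `reconcile` function are hypothetical; the source does not describe the real schema.

```typescript
type VarianceKind = "match" | "shortage" | "overage" | "unknown_bin";

// Assumed shape of an ERP master-data row.
interface ErpRecord {
  binCode: string;
  expectedQty: number;
}

// Assumed shape of one counted bin coming out of the agent layer.
interface BinCount {
  binCode: string;
  countedQty: number;
}

interface VarianceLine {
  binCode: string;
  kind: VarianceKind;
  expectedQty: number | null; // null when the bin is not in master data
  countedQty: number;
  delta: number | null; // counted minus expected
}

function reconcile(counts: BinCount[], erp: ErpRecord[]): VarianceLine[] {
  // Index master data by bin code for constant-time lookup.
  const byBin = new Map<string, ErpRecord>();
  for (const record of erp) byBin.set(record.binCode, record);

  return counts.map((count): VarianceLine => {
    const record = byBin.get(count.binCode);
    if (record === undefined) {
      return {
        binCode: count.binCode,
        kind: "unknown_bin",
        expectedQty: null,
        countedQty: count.countedQty,
        delta: null,
      };
    }
    const delta = count.countedQty - record.expectedQty;
    const kind: VarianceKind = delta === 0 ? "match" : delta < 0 ? "shortage" : "overage";
    return {
      binCode: count.binCode,
      kind,
      expectedQty: record.expectedQty,
      countedQty: count.countedQty,
      delta,
    };
  });
}
```

Each output line is self-describing, so the variance report can be filtered to non-`match` rows without re-querying the ERP.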

Outcomes

Proved a practical AI-first audit workflow that can run in real warehouse conditions in Nairobi.

Demonstrated operational viability for long-horizon agent sessions and resume/retry behavior.

Established a flagship product proof for forward-deployed AI engineering in East African enterprise environments.

Links & artifacts

Live Demo · GitHub · LinkedIn Profile

Related work

AIDC Barcode Toolkit

Open-source toolkit that packages real-world AIDC workflows so Claude Code can generate, validate, and reason about barcode and labeling tasks with domain-correct defaults.

Outcome-Driven Agent Evaluation (Hive)

Exploration and extension of the Hive framework for outcome-driven agent development, focusing on how teams iterate when success is measured by business results rather than single-turn benchmarks.

Discuss this work

Hiring or building something similar? Reach out with context and constraints.

Production AI Engineer | Remote · LLM agents & evals | Nairobi UTC+3

© 2026 Joseph Rwanda. All rights reserved.