Reflection loop

A pattern where the model critiques or scores its own output (or an agent’s step) before finalizing or retrying.

When to use it

  • Quality is inconsistent and human review is expensive.
  • Tasks have objective checks (format, policy, facts) you can automate.
  • You need graceful recovery when a step fails.

PM decision impact

Reflection improves reliability but adds cost and latency. PMs decide which checks are worth an extra model call and which auto-fixes are allowed. It also affects user trust: surfacing self-checks can reassure users or annoy them, depending on framing.

How to do it in 2026

Define lightweight checklists (policy, format, key facts). Run reflection only on risky steps and cap retries. In 2026, pair reflection with small guard models to avoid full LLM passes when a cheap classifier suffices.
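
Here is a minimal sketch of that loop, assuming hypothetical callables (generate, reflect, auto_fix, is_risky) that you would back with your own model calls and checks; only the control flow is the point, not a specific API.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Critique:
    passed: bool        # did the output clear the checklist?
    issues: list[str]   # checklist items that failed (policy, format, key facts)
    fixable: bool       # can the issues be auto-fixed without a full retry?

def reflection_loop(
    task: str,
    generate: Callable[[str], str],             # produces a draft (e.g. an LLM call)
    reflect: Callable[[str], Critique],         # scores the draft against the checklist
    auto_fix: Callable[[str, list[str]], str],  # patches minor, allowed issues
    is_risky: Callable[[str], bool],            # cheap guard: only reflect when needed
    max_retries: int = 2,                       # cap retries to bound cost and latency
) -> str:
    draft = generate(task)
    if not is_risky(draft):          # skip the extra pass on low-risk outputs
        return draft
    for _ in range(max_retries + 1):
        critique = reflect(draft)
        if critique.passed:
            return draft
        if critique.fixable:
            draft = auto_fix(draft, critique.issues)
        else:
            draft = generate(task)   # full retry for failures that can't be patched
    return draft                     # best effort after capped retries; caller can escalate
```

The guard (is_risky) can be a small classifier rather than a full LLM pass, which is what keeps the added cost proportional to risk.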

Example

A contract clause generator runs a reflection check for missing parties, dates, and PII leaks. It auto-fixes minor issues, raising the pass rate from 78% to 91% at the cost of a 12% increase in spend and 250 ms of extra latency, which stays within the 1.8 s SLO.
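
An illustrative checklist for that kind of clause check might look like the following; the patterns and fix rules are simplified stand-ins, not production checks, and it reuses the hypothetical Critique dataclass from the sketch above.

```python
import re

EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")

def reflect_clause(draft: str) -> Critique:
    issues = []
    if "Party A" not in draft or "Party B" not in draft:
        issues.append("missing party")
    if not re.search(r"\b\d{4}-\d{2}-\d{2}\b", draft):   # expects an ISO effective date
        issues.append("missing effective date")
    if EMAIL.search(draft):                               # treat stray emails as PII leaks
        issues.append("pii leak")
    # PII can be stripped automatically; missing parties or dates need a retry.
    fixable = issues == ["pii leak"]
    return Critique(passed=not issues, issues=issues, fixable=fixable)

def auto_fix_clause(draft: str, issues: list[str]) -> str:
    if "pii leak" in issues:
        draft = EMAIL.sub("[REDACTED]", draft)            # allowed auto-fix: redact, never rewrite terms
    return draft
```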

Common mistakes

  • Reflecting on every step indiscriminately, wasting budget.
  • Allowing the model to override hard safety rules during self-fix.
  • Not logging reflection outcomes, so regressions slip past.

Last updated: February 2, 2026