17 April 2026

P2P2 and Reflexion — Planning and Self-Correction

P2P2 🔴

Need: A complex task requiring planning before execution — an agent-level approach.

Explanation: Plan → Proof (verify) → Plan (refine) → Produce (execute). Two-phase planning with validation.

Prompt format:

“Step 1: Plan. Step 2: Verify the plan’s completeness. Step 3: Refine. Step 4: Execute.”

QA usage example

“Plan the migration from JUnit 4→5 (300 tests). Verify: custom runners, @Rule usage. Refine the sequence. Execute the first module.”
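
The four steps can also be run as separate model calls, so that each step sees the output of the previous one. The snippet below is a minimal sketch of that idea, not a reference implementation. The `llm` parameter is a placeholder for whatever function sends a prompt to your model and returns its reply, and the names `run_p2p2` and `checks` are made up for illustration.

```python
from typing import Callable, Dict, List

# Assumption: any function that takes a prompt string and returns the reply text.
LLM = Callable[[str], str]


def run_p2p2(task: str, checks: List[str], llm: LLM) -> Dict[str, str]:
    """Run Plan -> Proof -> Plan -> Produce as four chained model calls."""
    # Step 1: draft the plan.
    plan = llm(f"Step 1: Plan. Task: {task}")

    # Step 2: verify the draft against the risks we already know about.
    check_list = ", ".join(checks)
    proof = llm(
        "Step 2: Verify the plan's completeness. "
        f"Check specifically: {check_list}.\nPlan:\n{plan}"
    )

    # Step 3: refine the plan using the verification findings.
    refined = llm(
        "Step 3: Refine the plan based on the findings.\n"
        f"Plan:\n{plan}\nFindings:\n{proof}"
    )

    # Step 4: execute only the first step of the refined plan.
    result = llm(
        "Step 4: Execute the first step of the refined plan.\n"
        f"Plan:\n{refined}"
    )
    return {"plan": plan, "proof": proof, "refined": refined, "result": result}
```

For the JUnit example, the call would look like `run_p2p2("Migrate 300 tests from JUnit 4 to 5", ["custom runners", "@Rule usage"], llm)`. Keeping the intermediate outputs in the returned dictionary lets you review the verification findings before trusting the executed step.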

When to use P2P2?

For large, high-risk tasks such as migrations, framework builds, and refactoring. P2P2 forces you to validate the plan BEFORE implementation starts. The second planning pass catches mistakes that a single pass misses.

Reflexion 🔴

Need: The model should learn from its own mistakes over successive iterations.

Explanation: A loop: Answer → Feedback → Improve. The model evaluates its own response and corrects it independently.

Prompt format:

“Write a solution. Critically evaluate it. Improve it based on your own evaluation.”

QA usage example

“Write a Selenium test for the login flow. Evaluate: does it handle timeout, retry, flakiness? Improve based on the evaluation.”
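
If you drive the model from code instead of a single prompt, the Answer → Feedback → Improve loop can be written out explicitly. The snippet below is a rough sketch under a few assumptions: `llm` is a placeholder for your own prompt-in, text-out function, the name `reflexion` and the `max_rounds` limit are illustrative, and the convention of replying with `OK` to stop the loop is an invented stopping rule, not part of the framework.

```python
from typing import Callable

# Assumption: any function that takes a prompt string and returns the reply text.
LLM = Callable[[str], str]


def reflexion(task: str, criteria: str, llm: LLM, max_rounds: int = 2) -> str:
    """Answer -> Feedback -> Improve, repeated up to max_rounds times."""
    # Answer: first draft.
    answer = llm(f"Write a solution. Task: {task}")

    for _ in range(max_rounds):
        # Feedback: the model critiques its own draft against the criteria.
        feedback = llm(
            f"Critically evaluate this solution. Criteria: {criteria}\n"
            f"Solution:\n{answer}\n"
            "Reply with only OK if nothing needs to change."
        )
        # Hypothetical stopping rule: a bare OK means no further changes.
        if feedback.strip() == "OK":
            break

        # Improve: rewrite the draft using the critique.
        answer = llm(
            "Improve the solution based on this evaluation.\n"
            f"Solution:\n{answer}\nEvaluation:\n{feedback}"
        )
    return answer
```

For the Selenium example, the criteria string would be something like `"timeout, retry, flakiness"`. Capping the number of rounds keeps cost predictable, since the model may never declare its own answer finished.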

When to use Reflexion?

When you need higher quality from a single prompt: generating test code, writing documentation, analysis. Even one additional self-evaluation iteration can catch most of the problems (around 80%, as a rough rule of thumb).

In the next post: Self-Consistency, Meta Prompting, Least-to-Most, and PAL — the remaining advanced frameworks.