Prompt Engineering Guide: 10x Performance Boost for AI Agents

Research-backed prompt engineering techniques applied to 17 Claude Code agents, with measurable results and a practical implementation guide.

Overview

AI agent performance varies dramatically based on prompt quality. This post shares how we researched prompt engineering techniques from Japanese AI experts and applied them to 17 Claude Code agents in our project, with measurable results.

Key Results:

  • Role clarity: 17.6% → 100% (+82.4 percentage points)
  • Quality checklists: 23.5% → 82.4% (+58.9 percentage points)
  • Hallucination prevention mechanism introduced
  • Complete system improvement in 8 hours

Research Background: The “Agreeableness Filter” and “Hallucination” Problems

Problem 1: Excessive Agreeableness

AI tends to support and encourage users by default. While useful in casual conversation, this becomes problematic in technical work.

graph TD
    User[User Question] --> AI[AI Response]
    AI --> Agree[Unconditional Agreement]
    AI --> Praise[Excessive Praise]
    AI --> NoChallenge[Avoid Criticism]
    Agree --> Problem[Problem: Fail to Find Blind Spots]
    Praise --> Problem
    NoChallenge --> Problem

Real Example:

User: "Is this architecture good?"
AI (Before): "Excellent design! Perfect!"
AI (After): "From a scalability perspective, there are 3 potential bottlenecks: [specific issues]"

Problem 2: Hallucination

AI tends to confidently answer even when information is uncertain.

graph TD
    Query[Technical Question] --> Knowledge{Has Knowledge?}
    Knowledge -->|Yes| Verified[Verified Answer]
    Knowledge -->|No| Problem{Before}
    Knowledge -->|No| Solution{After}
    Problem --> Fabricate[Fabricate Info]
    Problem --> Speculate[Baseless Speculation]
    Solution --> Admit[Admit Unknown]
    Solution --> Suggest[Suggest Alternatives]

Real Example:

Question: "When is TypeScript 6.0 releasing?"

Before:
"TypeScript 6.0 will release in December 2025 with a new type system."
❌ Unverified information

After:
"【Certainty Level: Unknown】
TypeScript 6.0's official release date has not been announced.

【Speculation】
Based on past release cycles, late 2025 is possible, but this is unofficial speculation.

【Recommended Actions】
- Check official roadmap: https://github.com/microsoft/TypeScript/wiki/Roadmap
- Request Web Researcher to investigate latest info"
✅ Uncertainty disclosed + alternatives provided

6 Core Improvement Principles

1. Role Clarity

Principle: Explicit persona with “You are X who does Y” format

Before & After

Before:

# Writing Assistant Agent

Supports blog post and technical documentation creation.
  • ⚠️ Vague role
  • ⚠️ Unclear expertise
  • ⚠️ Undefined expectations

After:

# Writing Assistant Agent

## Role

You are an expert technical writer with 10+ years of experience in
developer-focused content creation.

Your expertise includes:
- Multi-language technical blogging
- SEO optimization for developer audiences
- Technical accuracy verification
- Cultural localization

You combine clarity of technical docs with compelling storytelling.
  • ✅ Clear identity
  • ✅ Defined expertise
  • ✅ Quality expectations set
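One way to track role clarity across many agents is a mechanical check. The sketch below assumes each agent prompt is markdown text with a `## Role` heading followed by a "You are" sentence, as in the After example. The function name `has_clear_role` and the regular expression are illustrative assumptions, not part of the original project.

```python
import re

# Hypothetical rule: a prompt has a clear role when a "## Role" heading
# is followed (after optional blank lines) by a sentence starting "You are".
ROLE_PATTERN = re.compile(r"^## Role\s+You are\b", re.MULTILINE)


def has_clear_role(prompt_text: str) -> bool:
    """Return True if the prompt declares an explicit persona."""
    return ROLE_PATTERN.search(prompt_text) is not None


before = (
    "# Writing Assistant Agent\n\n"
    "Supports blog post and technical documentation creation.\n"
)
after = (
    "# Writing Assistant Agent\n\n"
    "## Role\n\n"
    "You are an expert technical writer with 10+ years of experience in\n"
    "developer-focused content creation.\n"
)

print(has_clear_role(before))
print(has_clear_role(after))
```

A check like this only confirms that the persona statement exists; whether the stated expertise is specific enough still needs a human read.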

2. Explicit Constraints

Principle: Define “what NOT to do” explicitly

## What You DO:
- ✅ Generate well-researched blog posts
- ✅ Coordinate with Web Researcher for fact-checking
- ✅ Verify all code examples

## What You DON'T DO:
- ❌ Fabricate code examples → Instead: verify or test
- ❌ Make claims without sources → Instead: cite or delegate
- ❌ Execute web searches directly → Instead: delegate to Web Researcher

Expected effect: 90% reduction in agent errors
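The DO/DON'T block above pairs every prohibition with an "Instead" action. A minimal sketch of generating that section from data is shown here, so a DON'T cannot be written without its alternative. The names `Prohibition` and `render_constraints` are hypothetical helpers introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Prohibition:
    """One DON'T rule paired with the action to take instead."""
    rule: str
    instead: str


def render_constraints(do_items: list[str], dont_items: list[Prohibition]) -> str:
    """Render the DO / DON'T section of an agent prompt as markdown text."""
    lines = ["## What You DO:"]
    lines += [f"- ✅ {item}" for item in do_items]
    lines += ["", "## What You DON'T DO:"]
    lines += [f"- ❌ {p.rule} → Instead: {p.instead}" for p in dont_items]
    return "\n".join(lines)


prompt_section = render_constraints(
    do_items=[
        "Generate well-researched blog posts",
        "Coordinate with Web Researcher for fact-checking",
        "Verify all code examples",
    ],
    dont_items=[
        Prohibition("Fabricate code examples", "verify or test"),
        Prohibition("Make claims without sources", "cite or delegate"),
        Prohibition("Execute web searches directly", "delegate to Web Researcher"),
    ],
)
print(prompt_section)
```

Because `Prohibition` requires both fields, a rule with no alternative fails at construction time rather than slipping into the prompt.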

3. Uncertainty Handling ⭐

Principle: “Don’t know = admit it” - Most critical improvement

Certainty Level System

| Level | Description | Usage |
|---|---|---|
| High (90-100%) | Official documentation | “According to official docs…” |
| Medium (60-89%) | Expert consensus | “Generally […] approach is recommended” |
| Low (30-59%) | Pattern-based | “Speculation: […] is possible” |
| Unknown (<30%) | Cannot verify | “This information cannot be verified” |
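The bands in the certainty table can be written as a small lookup. This is a sketch under one loud assumption: that a numeric confidence percentage is available from somewhere (for example a reviewer's estimate). The original post does not say how the percentage is obtained, and `certainty_level` is a hypothetical helper.

```python
# (lower bound in percent, level name, phrase used to open the answer)
CERTAINTY_BANDS = [
    (90, "High", "According to official docs…"),
    (60, "Medium", "Generally […] approach is recommended"),
    (30, "Low", "Speculation: […] is possible"),
]
UNKNOWN = ("Unknown", "This information cannot be verified")


def certainty_level(confidence_pct: float) -> tuple[str, str]:
    """Map a confidence percentage (0-100) to (level, opening phrase)."""
    if not 0 <= confidence_pct <= 100:
        raise ValueError("confidence_pct must be between 0 and 100")
    for lower_bound, level, phrase in CERTAINTY_BANDS:
        if confidence_pct >= lower_bound:
            return level, phrase
    return UNKNOWN


for value in (95, 60, 45, 10):
    level, phrase = certainty_level(value)
    print(f"{value}% -> {level}: {phrase}")
```

Checking bands from highest to lowest keeps the boundaries in one place, so changing a threshold means editing a single tuple.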

4-6. Other Principles

  • Source Citation: Verifiable sources for all information
  • Structured Output: Consistent format with labeled sections: 【Conclusion】【Evidence】【Certainty】
  • Quality Checklist: Self-verification before completion
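Structured output and self-verification combine naturally: a response can be checked for its required sections before it is returned. The sketch below assumes sections are marked with 【…】 brackets and uses the English labels Conclusion, Evidence, and Certainty (renderings of 結論, 根拠, 確実性). The function `missing_sections` is a hypothetical helper, not the project's actual checklist.

```python
import re

REQUIRED_SECTIONS = ("Conclusion", "Evidence", "Certainty")


def missing_sections(response: str, required=REQUIRED_SECTIONS) -> list[str]:
    """Return required labels that never appear as 【Label】 or 【Label: value】."""
    # Capture the text after each opening bracket, up to a colon or the closing bracket.
    found = {label.strip() for label in re.findall(r"【([^】:]+)", response)}
    return [label for label in required if label not in found]


complete = (
    "【Conclusion】Use strict mode.\n"
    "【Evidence】Official handbook.\n"
    "【Certainty: High】"
)
incomplete = "【Conclusion】Use strict mode.\n【Certainty: Low】"

print(missing_sections(complete))
print(missing_sections(incomplete))
```

An empty list means the format check passed; any returned label is a checklist item the agent still has to fill in.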

Real Implementation: 17-Agent Improvement Project

Project Overview

  • Scope: 17 Claude Code agents
  • Duration: 8 hours (1 day)
  • Method: 3-phase gradual application

Results

graph LR
    Before[Before<br/>17.6% Role Clarity] --> Phase1[Phase 1<br/>3 Agents]
    Phase1 --> Phase2[Phase 2<br/>7 Agents]
    Phase2 --> Phase3[Phase 3<br/>6 Agents]
    Phase3 --> After[After<br/>100% Role Clarity]

| Metric | Before | After | Improvement (percentage points) |
|---|---|---|---|
| Role Clarity | 17.6% | 100% | +82.4 |
| Core Principles | 11.8% | 100% | +88.2 |
| Uncertainty Handling | 0% | 17.6% | +17.6 |
| Quality Checklists | 23.5% | 82.4% | +58.9 |
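The percentages in the results are consistent with simple counts out of 17 agents. For example, 17.6% corresponds to 3 of 17. The per-metric counts below are inferred from the published percentages and are an assumption; the original post reports only the percentages.

```python
TOTAL_AGENTS = 17


def pct(count: int, total: int = TOTAL_AGENTS) -> float:
    """Share of agents meeting a criterion, as a percentage with one decimal."""
    return round(100 * count / total, 1)


# (agents before, agents after): inferred counts, not stated in the post.
metrics = {
    "Role Clarity": (3, 17),
    "Core Principles": (2, 17),
    "Uncertainty Handling": (0, 3),
    "Quality Checklists": (4, 14),
}

for name, (count_before, count_after) in metrics.items():
    before, after = pct(count_before), pct(count_after)
    gain = round(after - before, 1)
    print(f"{name}: {before}% -> {after}% (+{gain} percentage points)")
```

The gains are differences between two percentages, so they are percentage points rather than relative increases.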

Most Powerful Improvement: “Don’t Know = Say So”

Hallucination Prevention

4-Step Uncertainty Process

  1. Admit Clearly: “This information could not be verified”
  2. Explain Why: “Not found in official documentation”
  3. Suggest Alternative: “Request Web Researcher to investigate”
  4. Show Certainty: High / Medium / Low / Unknown
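The four steps above can be sketched as a small data structure that refuses to render unless every step is filled in. The class name `UncertaintyReport` and its field names are illustrative assumptions. The output layout loosely follows the TypeScript 6.0 example earlier in this post.

```python
from dataclasses import dataclass

CERTAINTY_LEVELS = ("High", "Medium", "Low", "Unknown")


@dataclass(frozen=True)
class UncertaintyReport:
    admission: str    # Step 1: admit clearly
    reason: str       # Step 2: explain why
    alternative: str  # Step 3: suggest an alternative
    certainty: str    # Step 4: show certainty

    def __post_init__(self) -> None:
        if self.certainty not in CERTAINTY_LEVELS:
            raise ValueError(f"certainty must be one of {CERTAINTY_LEVELS}")

    def render(self) -> str:
        """Format the report in the bracketed style used in the examples."""
        return "\n".join([
            f"【Certainty Level: {self.certainty}】",
            self.admission,
            f"Reason: {self.reason}",
            "",
            "【Recommended Actions】",
            f"- {self.alternative}",
        ])


report = UncertaintyReport(
    admission="This information could not be verified",
    reason="Not found in official documentation",
    alternative="Request Web Researcher to investigate",
    certainty="Unknown",
)
print(report.render())
```

Making all four fields mandatory mirrors the process: an answer that admits uncertainty without a next step cannot be constructed.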

Expected Impact

These are projected effects, not measured results:

  • Hallucination: 90% reduction
  • User trust: 200% increase
  • Information quality: All claims verified and sourced

Practical Application Tips

1. Gradual Rollout

graph LR
    Start[Start] --> P1[Phase 1<br/>Core 3 Agents]
    P1 --> Test1[1 Week Test]
    Test1 --> P2[Phase 2<br/>7 More Agents]
    P2 --> Test2[1 Week Test]
    Test2 --> P3[Phase 3<br/>Remaining]
    P3 --> Complete[Complete]

  • Don’t change everything at once
  • Start with core agents
  • Measure effects at each phase

2. Backup First

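# Always commit a backup before making improvements
git add .claude/
git commit -m "backup: before prompt engineering improvements"

# Apply improvements and commit them as a separate commit

# Roll back if needed: revert the improvement commit, not the backup commit
git revert [commit-hash]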

3. Selective Application

Don’t apply every principle to every agent; match the principles to what the agent does.

| Agent Type | Required | Optional |
|---|---|---|
| Information (Writing, Research) | Role, Principles, Uncertainty, Checklist | DO/DON’T |
| Analysis (Analytics, SEO) | Role, Principles, Checklist | Uncertainty |
| Management (Site, Backlink) | Role, Principles | Checklist |
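The selective-application table can be encoded as a lookup, so each agent is checked only against the sections required for its type. The dictionary keys and the helper `missing_required` are hypothetical names chosen for this sketch. The required sets are taken from the table.

```python
# Required prompt sections per agent type, taken from the table above.
REQUIRED_BY_TYPE = {
    "information": {"Role", "Principles", "Uncertainty", "Checklist"},
    "analysis": {"Role", "Principles", "Checklist"},
    "management": {"Role", "Principles"},
}


def missing_required(agent_type: str, present_sections: set[str]) -> list[str]:
    """Return the required sections an agent prompt still lacks, sorted by name."""
    required = REQUIRED_BY_TYPE[agent_type]
    return sorted(required - present_sections)


print(missing_required("information", {"Role", "Principles"}))
print(missing_required("management", {"Role", "Principles"}))
```

Optional sections are deliberately left out of the check: a management agent without a checklist passes, while an information agent without uncertainty handling does not.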

Key Learnings

1. Power of Explicitness

Finding: “Explicit rules” are 10x more effective than “implicit expectations”

2. Honesty Builds Trust

Finding: “Don’t know = admit it” actually increases trust

Mechanism:

  • Removes illusion that AI knows everything
  • Provides only verifiable information
  • Gives users foundation to trust the system

3. Checklist Magic

Finding: Detailed checklists guarantee quality

Effect:

  • Prevents omissions
  • Maintains consistency
  • Enables self-verification

Conclusion

Core Message

“Don’t know = admit it” - Honest uncertainty disclosure is the most powerful technique for building AI agent reliability.

Major Achievements

  1. ✅ 17/17 agents improved (100%)
  2. ✅ Role clarity +82.4 percentage points
  3. ✅ Hallucination prevention mechanism introduced
  4. ✅ Quality checklists +58.9 percentage points

Practical Recommendations

  1. Start with Role: Clarify persona with “You are X”
  2. State Constraints: Define boundaries with DO/DON’T
  3. Handle Uncertainty: Essential for information-providing agents
  4. Add Checklists: Quality assurance mechanism
  5. Apply Gradually: Phase 1 → 2 → 3, measure at each step

Most Important Lesson

AI agent performance depends more on “how honest” than “how smart”. Agents that disclose uncertainty, provide sources, and verify systematically earn the most trust long-term.

References

Project Documentation

  • Full research docs: research/prompt-engineering/ folder
  • Improvement framework: research/prompt-engineering/03-improvement-framework.md
  • Implementation log: research/prompt-engineering/05-implementation-log.md
  • Verification results: research/prompt-engineering/06-verification-results.md

About the Author

Kim Jangwook

Full-Stack Developer specializing in AI/LLM

Building AI agent systems, LLM applications, and automation solutions with 10+ years of web development experience. Sharing practical insights on Claude Code, MCP, and RAG systems.