Core principle
An agent earns its keep when the task involves multiple steps that benefit from being orchestrated by AI, typically because the steps include verification, decision-making, or adaptation based on intermediate results. Single-step tasks rarely need an agent. Multi-step verification loops often do. The decision is never “agents are better” or “agents are dangerous”. It is: does the loop add value for this task, against the cost, latency, and audit complexity it creates?
Why this matters now
Agentic AI moved from research curiosity to production-ready through 2024–2025. RefCheckr is one example: a verify-fix-recheck loop that closes itself on a claim. The pattern generalises to systematic literature review conduct, compliance pre-screening with rewrite, evidence synthesis across studies, and more. Tool vendors and pharma innovation teams are increasingly pitching “agentic” solutions for systematic literature review, MLR triage, manuscript drafting, and evidence synthesis. Some are genuinely agentic; some are scripted pipelines marketed under the agent label. Medical writers are increasingly the people who have to assess these pitches, so knowing what an agent should deliver, and what red flags to look for, is now part of the job.
What “an agent” means here
An agent, in the sense relevant to medical writing, is an AI system that:
- Plans a sequence of steps for a goal
- Executes each step (often by calling tools — search APIs, document retrievers, verification services)
- Observes the result of each step
- Adapts based on what it observed (re-tries, re-plans, escalates)
- Loops until a goal condition is met or a limit is hit
An agent is not:
- A single-shot LLM prompt (one input, one output, no adaptation)
- A scripted pipeline (fixed steps, no decision-making between them)
- A human-orchestrated workflow (the human chooses what happens between steps)
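The plan, execute, observe, adapt, and loop cycle in the definition above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: `run_agent`, `AgentRun`, `toy_plan`, and the two toy tools are hypothetical names invented for the example, and a real agent would call an LLM and real services where the toy functions stand in.

```python
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical tool type: each tool takes a string and returns a string.
Tool = Callable[[str], str]


@dataclass
class AgentRun:
    goal: str
    observations: list[str] = field(default_factory=list)
    status: str = "running"  # ends as "done", "escalated", or "limit_hit"


def run_agent(
    goal: str,
    plan: Callable[[str, list[str]], list[tuple[str, str]]],
    tools: dict[str, Tool],
    goal_met: Callable[[list[str]], bool],
    max_rounds: int = 3,
) -> AgentRun:
    run = AgentRun(goal=goal)
    for _ in range(max_rounds):                  # loop, with a hard limit
        steps = plan(goal, run.observations)     # plan (re-plans every round)
        for tool_name, tool_input in steps:
            tool = tools.get(tool_name)
            if tool is None:                     # adapt: unknown tool, escalate
                run.status = "escalated"
                return run
            run.observations.append(tool(tool_input))  # execute and observe
        if goal_met(run.observations):           # goal condition reached
            run.status = "done"
            return run
    run.status = "limit_hit"                     # limit reached without success
    return run


def toy_plan(goal: str, observations: list[str]) -> list[tuple[str, str]]:
    # Toy planner: search first, then verify whatever the search returned.
    if not observations:
        return [("search", goal)]
    return [("verify", observations[-1])]


toy_tools: dict[str, Tool] = {
    "search": lambda query: f"found source for: {query}",
    "verify": lambda text: "verified" if "found source" in text else "unverified",
}

result = run_agent("claim X", toy_plan, toy_tools, lambda obs: "verified" in obs)
print(result.status, result.observations)
```

A scripted pipeline would hard-code the step list instead of calling `plan` with the observations so far; a single-shot prompt would have no loop at all.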
Agent patterns worth knowing
Verify-fix loop (closed-loop)
Plan-execute-verify
Tool-using agent
Iterative refinement
Multi-agent / role-based
When an agent earns its keep
| Signal that an agent fits | Why |
|---|---|
| The task has a natural verification gate | The loop has a place to close |
| Errors compound across steps | Multi-step thinking outperforms single-shot |
| The task is high-volume | Automation pays back the build cost |
| Intermediate steps would benefit from tool use | Agent can route to the right tool per step |
| You can audit each step | The agent’s value is auditable, not just visible in the final output |

| Application | Pattern | Status (mid-2026) |
|---|---|---|
| Claim verification (RefCheckr) | Verify-fix loop | Production |
| Compliance pre-screen with rewrite | Verify-fix loop | Emerging |
| Systematic literature review conduct | Plan-execute-verify with tool use | Emerging |
| Evidence synthesis across multiple studies | Plan-execute-verify | Emerging |
| Manuscript drafting with reference self-check | Iterative refinement | Emerging |
| Plain-language summary with source verification | Verify-fix loop | Emerging |
| Pharmacovigilance signal triage (high-risk) | Plan-execute-verify | Emerging — flag as high-risk under EU AI Act |
When an agent is the wrong tool
Single-step generation tasks
High-stakes, low-volume one-shots
When verification is harder than the task
When you can't audit intermediate steps
Time-sensitive interactive work
Failure modes specific to agents
The AI failure modes page covers the core risks. Agents add several patterns on top:
- Cascading errors: One wrong intermediate step poisons every downstream step. The final output is fluent and confidently wrong.
- Reasoning chain fabrication: The agent’s “thinking” can include invented citations or fabricated checks that pass its own verification but fail real verification. Covered in Choosing Your Model.
- Tool misuse: Agents calling the wrong tool, or the right tool with wrong parameters. A claim sent to a literature search instead of a source-paper verifier looks like work but produces noise.
- Cost and latency runaway: Loops that don’t converge, or that keep retrying. Cap iterations explicitly.
- Prompt injection at any step: Each step that ingests external content (a paper, a web result, a user upload) is a potential injection point. Defend at each.
- Audit-trail loss: Multi-step agents are harder to log meaningfully than single-shot prompts. Plan logging upfront, not after the fact.
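Two of the mitigations in the list above, capping iterations and planning the log upfront, can be combined in one small loop. The sketch below assumes a `verify` function that returns a list of issues (empty means pass) and a `fix` function that returns a revised draft; both, along with `verify_fix_loop` and the toy functions, are hypothetical stand-ins rather than any real tool's API.

```python
from datetime import datetime, timezone
from typing import Callable


def verify_fix_loop(
    draft: str,
    verify: Callable[[str], list[str]],    # returns issues; empty list means pass
    fix: Callable[[str, list[str]], str],  # returns a revised draft
    max_iterations: int = 3,               # explicit cap on verification passes
) -> tuple[str, bool, list[dict]]:
    audit_log: list[dict] = []

    def log(step: str, iteration: int, detail: dict) -> None:
        # One entry per step, so every intermediate result can be reviewed later.
        audit_log.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "iteration": iteration,
            "step": step,
            **detail,
        })

    for i in range(1, max_iterations + 1):
        issues = verify(draft)
        log("verify", i, {"issues": issues})
        if not issues:
            return draft, True, audit_log   # loop closed: verification passed
        if i == max_iterations:
            break                           # cap reached: no further fix attempts
        draft = fix(draft, issues)
        log("fix", i, {"draft": draft})
    # Did not converge within the cap: hand off to a human reviewer.
    return draft, False, audit_log


def toy_verify(text: str) -> list[str]:
    # Toy check: flag absolute wording as an unsupported claim.
    return ["unsupported claim"] if "always" in text else []


def toy_fix(text: str, issues: list[str]) -> str:
    return text.replace("always", "often")


final, passed, entries = verify_fix_loop(
    "The effect was always observed.", toy_verify, toy_fix
)
print(passed, len(entries), final)
```

Returning the log alongside the result keeps the audit trail attached to the output, and the `False` return path makes a non-converging loop an explicit escalation instead of a silent retry.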
Practical assessment for medical writing teams
Before commissioning or using an agentic workflow:
- Is this actually agentic?
- Where does the loop close?
- What does each step cost?
- Can you audit it?
- What is the risk tier?
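The cost question above comes down to simple arithmetic: per-step cost and latency, multiplied by how often each step runs per document, retries included. The sketch below shows one way to lay that out. Every number in it is a placeholder for illustration only, not a measured or vendor-quoted figure, and `Step` and `estimate` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Step:
    name: str
    cost_per_call: float     # in your billing currency (placeholder values below)
    seconds_per_call: float
    expected_calls: float    # average calls per document, including retries


def estimate(steps: list[Step], documents: int) -> dict[str, float]:
    # Sum cost and latency across steps, weighted by expected calls per document.
    cost_per_doc = sum(s.cost_per_call * s.expected_calls for s in steps)
    seconds_per_doc = sum(s.seconds_per_call * s.expected_calls for s in steps)
    return {
        "cost_per_document": cost_per_doc,
        "seconds_per_document": seconds_per_doc,
        "total_cost": cost_per_doc * documents,
    }


# Placeholder numbers only; substitute values measured on your own workload.
example_steps = [
    Step("plan", cost_per_call=0.02, seconds_per_call=4.0, expected_calls=1.0),
    Step("generate", cost_per_call=0.01, seconds_per_call=3.0, expected_calls=2.0),
    Step("verify", cost_per_call=0.02, seconds_per_call=5.0, expected_calls=2.0),
]
summary = estimate(example_steps, documents=100)
print(summary)
```

Comparing this estimate with the cost of a single-shot prompt plus human review is one way to check whether the loop adds value for the task.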
Common mistakes
Wrapping single-shot tasks in agent infrastructure
Skipping the verification gate
Trusting agent self-reports
Letting the loop run without a cap
Treating agents as a category instead of a pattern
How this connects to other playbook principles
- Closed-loop AI: The verify-fix loop is the foundational agentic pattern. RefCheckr is the worked example.
- Choosing Your Model: Agents typically use reasoning models for the planning and verification steps; standard models for generation. Mixing classes is the cost-efficient default.
- Source grounding: Agents do not exempt content from source grounding. Each generation step still needs verifiable source attribution.
- Review and accountability: The audit-trail requirements apply per agent step, not per overall agent run. Plan logging upfront.
- AI Regulation in Pharma: Some agentic uses (PV signal triage, clinical decision support) cross into high-risk territory under the EU AI Act. Map the use, not the technology.
- AI failure modes: Agents inherit single-shot failure modes and add new ones (cascading errors, tool misuse, audit-trail loss).
The bottom line
Agents are not better or worse than single-shot prompts; they are different tools for different tasks. The pattern earns its keep on multi-step verification, high-volume work, and tasks with natural verification gates. It is the wrong tool for single-step generation, low-volume one-shots, and anywhere the loop cannot close auditably. Specify the pattern, cap the iterations, log every step, and verify the agent the same way you would verify any other AI output.
Last reviewed: 4 May 2026