| Term | Definition |
| --- | --- |
| ABPI | Association of the British Pharmaceutical Industry. UK pharmaceutical industry body whose Code of Practice governs promotional materials. |
| AE | Adverse event. Any undesirable medical occurrence in a patient during a clinical trial, whether or not it is related to the treatment. |
| AI agent | A program that uses a language model to plan and carry out multi-step tasks by calling tools, retrieving information, or acting on behalf of a user. Differs from a simple chatbot in that it takes actions rather than only generating text. |
| Algorithmic bias | Systematic errors in AI output that disadvantage particular groups, usually because of skews in training data or design choices. In medical contexts, under-representation of women, ethnic minorities, and paediatric or older populations leads to outputs that work less well for those groups. A quality and equity consideration in any patient-facing AI use. |
| Chatbot | An AI that holds a turn-by-turn conversation with a user. Distinct from an AI agent: a chatbot generates responses, while an agent plans, calls tools, observes results, and adapts. Most consumer-facing AI products are chatbots; some are agentic underneath. |
| CI | Confidence interval. A range of values within which the true treatment effect is expected to fall at a stated confidence level (e.g., 95% CI: 0.52–0.84). |
| Closed-loop AI | An AI system that acts, observes the result, and adjusts, repeating that cycle without a human at each step. Contrasts with open-loop use, where the model produces a single output and a human decides what to do next. RefCheckr is one example: it verifies a claim against the source paper, rewrites it if the data don’t match, then re-verifies and checks ABPI compliance, looping again if anything fails. |
| Computer use | A model capability to see the screen, move a cursor, and type, controlling a computer or browser the way a person does. Used for agents that automate desktop tasks. Higher risk than text generation because actions are taken in the real world, not just suggested. |
| Context window | The amount of text (input plus output) a language model can process in a single call, measured in tokens. A 200,000-token window holds roughly 150,000 words. Sets the limit on how much source material can be provided in one prompt. |
| CSR | Clinical study report. The comprehensive report of a clinical trial, including methods, results, and analysis. Primary source document for many medical writing deliverables. |
| Deepfake | AI-generated audio, video, or images depicting real people doing or saying things they didn’t. Increasingly relevant to medical writing because image-manipulation screens used by journals (see AI in Peer Review) flag both deliberate manipulation and AI-generated figures presented as authentic data. |
| Digital twin | A virtual model of a real-world system (a patient, an organ, a manufacturing process) used to simulate behaviour and predict outcomes. In pharma, increasingly applied to in silico trial control arms, dose-response modelling, and patient-response prediction. |
| DOI | Digital object identifier. A unique, persistent identifier for published articles (e.g., 10.1000/xyz123). |
| Embedding | A numerical representation of text that captures semantic meaning, used to compare how similar two pieces of text are. Underpins retrieval in RAG systems and semantic search. |
| EMA | European Medicines Agency. The EU regulatory authority responsible for the scientific evaluation and approval of medicines. |
| EU CTR | EU Clinical Trials Regulation. Requires sponsors to publish lay-friendly summaries of trial results. |
| EudraCT | European Union Drug Regulating Authorities Clinical Trials Database. The EU register for clinical trials, referenced in regulatory submissions and PLS documents. |
| Fair balance | Regulatory requirement that promotional materials present both benefits and risks of a treatment in a balanced manner. |
| FDA | Food and Drug Administration. The US regulatory authority responsible for approving drugs, biologics, and medical devices. |
| Fine-tuning | Additional training that adapts a general-purpose language model to a specific domain, style, or task using a smaller dataset of examples. |
| Foundation model | A large AI model trained on broad data that serves as the base for downstream applications and tools. Claude, GPT, Gemini, and Llama are foundation models. Different from a frontier model: every frontier model is a foundation model, but most foundation models are not at the frontier at any given time. |
| Frontier model | Shorthand for the most capable models available at any given time (e.g., Claude Opus 4.X, GPT-5). The label moves as new releases arrive, so today’s frontier model is a generation behind within a year. |
| Generative AI | AI that produces new content (text, images, audio, video, code) rather than only classifying or predicting. Most of the AI covered in this playbook is generative AI. |
| Grounding | Anchoring an AI’s output to a specific source rather than letting the model rely on what it learned during training. Source grounding is the medical-writing application: every claim must trace to provided evidence. See Source Grounding. |
| Guardrails | Explicit rules that constrain what a model is allowed to do or say. In medical writing, examples include “no new numbers”, “no language stronger than the source”, or “do not infer beyond the data”. Usually enforced in the prompt or by a wrapping system, not by the model itself. |
| Hallucination | AI output that reads plausibly but is factually wrong or invented: a fabricated reference, a misremembered statistic, or a claim with no basis in the source. A primary risk in medical writing. |
| Harness | The software wrapper around a language model that turns it into a usable product, handling tool calls, file access, memory, retries, and the loop between the model and the outside world. Claude Code, Cursor, and ChatGPT are all harnesses around their underlying models. The capabilities of an AI tool depend as much on the harness as on the model. |
| HCP | Healthcare professional. Includes physicians, nurses, pharmacists, and other qualified practitioners. |
| HEOR | Health economics and outcomes research. The discipline focused on the economic value and real-world outcomes of healthcare interventions. |
| Human-in-the-loop (HITL) | A workflow design where a human reviews, edits, or approves AI output before it is used. The default for medical writing content destined for external use. See Human-in-the-Loop. |
| HR | Hazard ratio. The ratio of event rates (hazards) between two groups over time, commonly used in survival analysis (e.g., HR 0.67 means a 33% lower hazard in the treatment group than in the comparator). |
| IB | Investigator’s brochure. A regulatory document summarising the clinical and non-clinical data on a compound, provided to investigators conducting clinical trials. |
| ICH | International Council for Harmonisation. Sets technical guidelines for pharmaceutical development and regulatory submissions (e.g., ICH E6 for GCP, ICH M4 for CTD structure). |
| IFPMA | International Federation of Pharmaceutical Manufacturers & Associations. Global industry body with a Code of Practice for marketing. |
| IMRAD | Introduction, Methods, Results, and Discussion. Standard structure for scientific manuscripts. |
| ITT | Intention-to-treat. Analysis that includes all randomised participants in the groups to which they were assigned, regardless of whether they completed the study. The primary analysis in most RCTs. |
| KOL | Key opinion leader. A recognised expert in a therapeutic area, often engaged for advisory boards and speaker programmes. |
| LLM | Large language model. A statistical model trained on large volumes of text that generates language by predicting the next token. GPT, Claude, and Gemini are LLMs. |
| MCP | Model Context Protocol. An open standard created by Anthropic that lets AI assistants call external tools securely. PubCrawl is an MCP server. |
| MedDRA | Medical Dictionary for Regulatory Activities. The standardised medical terminology used in regulatory reporting of adverse events. |
| MeSH | Medical Subject Headings. The controlled vocabulary used by PubMed/MEDLINE to index biomedical literature. |
| mITT | Modified intention-to-treat. A variation of ITT that excludes certain participants (e.g., those who never received treatment). Definition varies by study. |
| MLR | Medical, Legal, Regulatory review. The formal review process for promotional and medical content before external use. |
| MOA | Mechanism of action. How a drug produces its pharmacological effect. |
| MSL | Medical science liaison. A field-based medical affairs professional who engages with HCPs on scientific and clinical matters. |
| Multimodal | A model or system that can process more than one type of input, typically text plus images, audio, or video. In medical writing, useful for reading figures, tables, or scanned documents. |
| NMA | Network meta-analysis. A statistical method for comparing multiple treatments indirectly through a network of studies. |
| OCR | Optical character recognition. Converts scanned document images to searchable text. Poor OCR quality can cause errors in automated reference checking. |
| Open-loop AI | An AI system that produces a single output and stops; a human decides what to do next. Contrasts with closed-loop AI. Most everyday LLM use is open-loop: ask, read, decide. |
| OR | Odds ratio. A measure of association between an exposure and an outcome (e.g., OR 1.5 means 50% higher odds). |
| ORR | Objective response rate (or overall response rate). The proportion of patients with a defined reduction in tumour size or disease activity. |
| OS | Overall survival. Time from randomisation to death from any cause. A primary endpoint in many oncology trials. |
| PFS | Progression-free survival. Time from randomisation to disease progression or death from any cause, whichever occurs first. A common endpoint in oncology. |
| PI | Prescribing information. The approved product labelling. See also SmPC (EU) and USPI (US). |
| PICO | Population, Intervention, Comparator, Outcomes. A framework for structuring clinical research questions. |
| PLS | Plain language summary. A lay-friendly summary of clinical trial results, increasingly required by regulation. |
| PMCPA | Prescription Medicines Code of Practice Authority. The UK body that administers the ABPI Code of Practice. |
| PMID | PubMed identifier. A unique number assigned to each article indexed in PubMed. |
| PP | Per-protocol. Analysis that includes only participants who completed the study as planned, without major protocol deviations. |
| PRISMA | Preferred Reporting Items for Systematic Reviews and Meta-Analyses. A reporting guideline for systematic reviews. |
| Prompt | The instruction given to a language model. The structure, context, and constraints of a prompt strongly influence output quality. |
| Prompt engineering | The practice of designing prompts to produce reliable, useful output from a language model. Includes setting role, constraints, examples, and output format. |
| Prompt injection | A safety risk where instructions hidden in external content (a PDF, a web page, a user upload) cause the model to do something it wasn’t asked to do. Especially relevant for tools that ingest documents or browse the web on the user’s behalf. |
| QALY | Quality-adjusted life year. A measure of health outcome used in health economics, combining quantity and quality of life; one QALY equals one year of life in perfect health. |
| QC | Quality control. The review and verification process applied to deliverables before submission or publication. |
| RAG | Retrieval-augmented generation. A technique where a language model is given relevant documents to draw on when answering, rather than relying only on what it learned during training. Central to source-grounded medical writing systems. MedCheckr uses RAG to evaluate content against the ABPI Code of Practice. |
| RCT | Randomised controlled trial. A study design where participants are randomly assigned to treatment or control groups. |
| Reasoning model | A class of language model that explicitly “thinks” before answering, generating intermediate reasoning steps that improve performance on complex tasks. Examples: Claude with extended thinking, OpenAI’s o-series. Useful for multi-step verification or planning, less necessary for simple drafting. |
| SAE | Serious adverse event. An AE that results in death, is life-threatening, requires hospitalisation, causes disability, or is otherwise medically significant. |
| SAP | Statistical analysis plan. The document specifying the planned statistical analyses for a clinical trial, written before database lock. |
| Skill | A self-contained, packaged capability an AI assistant can install and call on demand (e.g., Claude Skills, OpenClaw ClawHub skills). Skills bundle prompts, tools, and instructions for a specific job. Patiently AI is published as a skill. |
| SLR | Systematic literature review. A structured, reproducible search and analysis of published evidence, following a predefined protocol. |
| SmPC | Summary of Product Characteristics. The EU equivalent of prescribing information. Approved by the EMA or national authority. |
| SOP | Standard operating procedure. Documented internal processes that define how work is conducted within an organisation. |
| System prompt | The persistent instructions given to a model that define its role, constraints, and behaviour, separate from the user’s turn-by-turn input. Most of the “personality” and capabilities of a tool live in the system prompt, not the model. |
| TFL | Tables, figures, and listings. The statistical outputs from a clinical trial used as source material for CSR drafting. |
| Token | The unit of text a language model processes, roughly ¾ of a word in English. Input and output length, and API costs, are measured in tokens. |
| Tool use | The mechanism by which a language model calls external functions: a search API, a database query, a file operation, a custom service. Also called function calling. Tool use turns a chatbot into an agent; MCP is one standard for declaring available tools. |
| USPI | United States Prescribing Information. The FDA-approved product labelling. |
| Zero-shot / few-shot | Prompting styles. Zero-shot gives the model a task with no examples; few-shot includes one or more worked examples. Few-shot often improves accuracy on structured tasks. |
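
The numeric rules of thumb in the Token, Context window, HR, and OR entries can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names and the `reserved_output_tokens` parameter are hypothetical, and the ¾-word-per-token ratio is the glossary's approximation for English text, not an exact figure (real counts depend on the model's tokenizer).

```python
import math

# Glossary rule of thumb: one token is roughly 3/4 of an English word.
# This is an approximation; actual counts depend on the tokenizer.
WORDS_PER_TOKEN = 0.75


def tokens_to_words(tokens: int) -> int:
    """Estimate how many English words fit in a given number of tokens."""
    return round(tokens * WORDS_PER_TOKEN)


def words_to_tokens(words: int) -> int:
    """Estimate the tokens needed for a given word count (rounded up)."""
    return math.ceil(words / WORDS_PER_TOKEN)


def fits_in_window(source_words: int, window_tokens: int,
                   reserved_output_tokens: int = 0) -> bool:
    """Check whether source text plus reserved output fits the context window.

    The window covers input plus output, so any output budget must be
    subtracted from the space available for source material.
    """
    needed = words_to_tokens(source_words) + reserved_output_tokens
    return needed <= window_tokens


def relative_change_percent(ratio: float) -> float:
    """Convert a ratio (HR or OR) into a percentage change versus 1.0.

    Negative values are reductions, positive values are increases.
    """
    return round((ratio - 1.0) * 100, 1)


if __name__ == "__main__":
    print(tokens_to_words(200_000))       # 200,000-token window in words
    print(fits_in_window(150_000, 200_000, reserved_output_tokens=4_000))
    print(relative_change_percent(0.67))  # HR 0.67 as a percentage change
    print(relative_change_percent(1.5))   # OR 1.5 as a percentage change
```

Because the window counts input and output together, a source document that exactly fills the window by this estimate leaves no room for the model's reply, which is why the check accepts an optional output reserve.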