ReAct (Reasoning + Acting)
Interleaving reasoning and action in agents.
Definition
ReAct is a paradigm where the model alternates reasoning (what to do next, why) and acting (tool calls). The observation from the environment feeds back into the next reasoning step, forming a loop until the task is done. This interleaving reduces errors caused by blind or repetitive tool use, because each action is preceded by an explicit rationale.
The core contribution of the ReAct paper is showing that combining reasoning traces and action steps in a single prompt outperforms either alone: pure reasoning (CoT) lacks factual grounding, and pure action (tool calling without thought) is error-prone and hard to debug. By making thoughts visible, ReAct also produces interpretable agent traces that humans can inspect and correct.
It is the standard pattern for agents that use tools. Often combined with chain-of-thought (reasoning inside the thought step) and with RDD when retrieved specifications should guide each decision.
How it works
Thought–action–observation loop
Agent decision flow
Prompt format is Thought → Action → Observation → Thought → … → Final Answer. The user gives a task; the agent produces a thought (reasoning about what to do), then an action (e.g. tool call). The environment/tools return an observation, which is appended to the context for the next thought. The model decides when to call tools and when to conclude. Frameworks like LangChain and LlamaIndex implement ReAct-style agents with tool registration and message handling.
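The loop above can be sketched without any framework. This is a minimal illustration, not a production implementation: `scripted_llm` is a hypothetical stand-in for a real model call, and the single `calculator` tool stands in for a real tool registry.

```python
import re

# Hypothetical tool registry; a real agent would register search, API, etc.
TOOLS = {
    "calculator": lambda expr: str(eval(expr, {"__builtins__": {}})),
}

def scripted_llm(context):
    """Stand-in for a real LLM call: returns a canned Thought/Action step,
    then a Final Answer once an Observation is present in the context."""
    if "Observation:" not in context:
        return ("Thought: I need to compute 17 * 23.\n"
                "Action: calculator[17 * 23]")
    return "Thought: I have the result.\nFinal Answer: 391"

def react_loop(task, llm, max_steps=5):
    context = f"Task: {task}\n"
    for _ in range(max_steps):          # bounded loop = stopping criterion
        step = llm(context)             # Thought (+ Action or Final Answer)
        context += step + "\n"
        if "Final Answer:" in step:
            return step.split("Final Answer:")[1].strip()
        match = re.search(r"Action: (\w+)\[(.*)\]", step)
        if match:
            tool, arg = match.group(1), match.group(2)
            observation = TOOLS[tool](arg)          # act, then observe
            context += f"Observation: {observation}\n"  # feed back into context
    return None  # loop exhausted without a final answer

print(react_loop("What is 17 * 23?", scripted_llm))  # → 391
```

The key structural points are visible even in this sketch: the observation is appended to the context so the next thought can use it, and `max_steps` caps the loop so weak stopping criteria cannot run it forever.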
When to use / When NOT to use
| Scenario | Use ReAct? | Why |
|---|---|---|
| Agent using multiple tools (search, calculator, API) | Yes | Thought before action reduces tool misuse; if only one tool is needed, simpler function calling suffices |
| Debuggable agent behavior required | Yes | Thought traces are inspectable and loggable; skip for black-box pipelines where traces aren't needed |
| Multi-step research with evolving context | Yes | Each observation informs the next thought; for one-off lookups, single-shot retrieval + generation is faster and cheaper |
| High-reliability tasks (e.g. code execution) | Yes | Reasoning before acting catches likely errors; simple CRUD tasks with no ambiguity don't need it |
| Very low latency requirements | No | Thought generation adds tokens per step; direct function calling is faster when reasoning is unnecessary |
Comparisons
| Pattern | Has explicit thought | Has tool use | Loop | Best for |
|---|---|---|---|---|
| CoT | Yes | No | No | Static reasoning tasks |
| ReAct | Yes | Yes | Yes | Tool-using agents |
| Function calling (no thought) | No | Yes | No | Simple, deterministic tool invocations |
| RDD | Yes (spec-guided) | Yes | Yes | Compliance and spec-driven agents |
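The prompt-level difference between CoT and ReAct in the table above can be made concrete with two skeleton templates. These are illustrative sketches, not the exact templates any framework ships:

```python
# Sketch: CoT elicits reasoning only; ReAct adds an action grammar and a loop.
COT_PROMPT = """Question: {question}
Let's think step by step."""

REACT_PROMPT = """Answer the question using the tools below.
Tools: {tool_names}

Use this format:
Thought: reason about what to do next
Action: tool_name[input]
Observation: tool result
... (repeat Thought/Action/Observation as needed)
Final Answer: the answer

Question: {question}"""

print(REACT_PROMPT.format(tool_names="search, calculator",
                          question="What is the population of Tokyo?"))
```

Function calling without thought drops the `Thought:` lines entirely, which is why its traces are harder to debug; RDD-style agents would additionally inject retrieved specifications into the prompt before each decision.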
Pros and cons
| Pros | Cons |
|---|---|
| Reduces blind or repetitive tool calls | Extra tokens per step (thought overhead) |
| Produces interpretable, debuggable traces | Loop can run too long if stopping criteria are weak |
| Works well with LangChain/LlamaIndex out of the box | Requires well-defined tool schemas and error handling |
| Naturally handles multi-step tasks | Thought quality depends on the underlying model |
Code examples

```python
from langchain_openai import ChatOpenAI
from langchain.agents import create_react_agent, AgentExecutor
from langchain_community.tools import DuckDuckGoSearchRun
from langchain import hub

# Load a pre-built ReAct prompt template
prompt = hub.pull("hwchase17/react")

# Define tools
tools = [DuckDuckGoSearchRun()]

# Create the ReAct agent
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)
agent = create_react_agent(llm=llm, tools=tools, prompt=prompt)
executor = AgentExecutor(agent=agent, tools=tools, verbose=True, max_iterations=5)

# Run — the agent will produce Thought/Action/Observation traces
result = executor.invoke({"input": "What is the current population of Tokyo?"})
print(result["output"])
```

Practical resources
- ReAct: Synergizing Reasoning and Acting in LLMs (Yao et al.) — Original ReAct paper with benchmarks on HotpotQA, Fever, and ALFWorld
- LangChain – ReAct agent — ReAct-style agents with tool registration in LangChain
- Anthropic – Tool use guide — Claude's native tool use, which follows ReAct-style thought–act patterns
Sources
- ReAct: Synergizing Reasoning and Acting in Language Models (Yao et al., 2022) — Seminal paper introducing the interleaved thought–action–observation loop and benchmarking it on HotpotQA, FEVER, and ALFWorld.
- Chain-of-Thought Prompting Elicits Reasoning in Large Language Models (Wei et al., 2022) — Foundational work on explicit reasoning steps that ReAct builds upon.
- Toolformer: Language Models Can Teach Themselves to Use Tools (Schick et al., 2023) — Complementary work on self-supervised tool use, often discussed alongside ReAct.
- Reflexion: Language Agents with Verbal Reinforcement Learning (Shinn et al., 2023) — Extends ReAct with episode-level verbal reflection for iterative self-improvement.