ReAcTree: Hierarchical LLM Agent Trees
with Control Flow for Long-Horizon Task Planning

Jae-Woo Choi¹,², Hyungmin Kim², Hyobin Ong², Youngwoo Yoon¹,²,
Minsu Jang¹,², Dohyung Kim¹,², Jaehong Kim¹

¹ Electronics and Telecommunications Research Institute (ETRI)
² University of Science and Technology (UST)


Abstract

Recent advancements in large language models (LLMs) have enabled significant progress in decision-making and task planning for embodied autonomous agents. However, most existing methods struggle with complex, long-horizon tasks because they rely on a monolithic trajectory that entangles all past decisions and observations to solve the entire task in a single unified process. To address this limitation, we propose ReAcTree, a hierarchical task-planning method that decomposes a complex goal into manageable subgoals within a dynamically constructed agent tree. Each subgoal is handled by an LLM agent node capable of reasoning, acting, and further expanding the tree, while control flow nodes coordinate the execution strategies of agent nodes. In addition, we integrate two complementary memory systems: each agent node retrieves goal-specific, subgoal-level examples from episodic memory and shares environment-specific observations through working memory. Experiments on WAH-NL and ALFRED show that ReAcTree consistently outperforms strong task-planning baselines such as ReAct across diverse LLMs. Notably, on WAH-NL, ReAcTree achieves a 61% goal success rate with Qwen 2.5 72B, nearly doubling ReAct’s 31%.


Method Overview

**Figure 1. Overview of the ReAcTree framework.** Given a high-level goal from a user, the root agent node generates a sub-plan and expands the tree by adding appropriate control flow nodes and child agent nodes. The agent tree is then executed under the coordination of the control flow nodes, with agent nodes reasoning and acting in the environment.
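The expand-then-execute loop described in the caption can be illustrated with a minimal Python sketch. This is not the authors' implementation: the class names, the `policy` callables standing in for LLM agents, the dictionary used as working memory, and the behavior-tree-style "sequence" and "fallback" semantics are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Union

# Hypothetical shared store standing in for working memory.
WorkingMemory = Dict[str, str]
# A policy either reports success/failure or returns a subtree to expand into.
Outcome = Union[bool, "ControlFlowNode"]


@dataclass
class AgentNode:
    """Handles one subgoal; `policy` is a stand-in for an LLM agent."""
    subgoal: str
    policy: Callable[[str, WorkingMemory], Outcome]

    def run(self, memory: WorkingMemory) -> bool:
        outcome = self.policy(self.subgoal, memory)
        if isinstance(outcome, ControlFlowNode):
            # The agent chose to expand the tree: execute the new subtree.
            return outcome.run(memory)
        return outcome


@dataclass
class ControlFlowNode:
    """Coordinates children; the two kinds below are assumed semantics."""
    kind: str  # "sequence" or "fallback"
    children: List[Union[AgentNode, "ControlFlowNode"]] = field(default_factory=list)

    def run(self, memory: WorkingMemory) -> bool:
        if self.kind == "sequence":
            # Succeeds only if every child succeeds; stops at the first failure.
            return all(child.run(memory) for child in self.children)
        if self.kind == "fallback":
            # Succeeds as soon as one child succeeds.
            return any(child.run(memory) for child in self.children)
        raise ValueError(f"unknown control flow kind: {self.kind}")


def find_apple(subgoal: str, memory: WorkingMemory) -> Outcome:
    # Share an observation so that later nodes can reuse it.
    memory["apple"] = "kitchen table"
    return True


def store_apple(subgoal: str, memory: WorkingMemory) -> Outcome:
    # Succeeds only if an earlier node has shared the apple's location.
    return "apple" in memory


def root_policy(goal: str, memory: WorkingMemory) -> Outcome:
    # The root agent decomposes the goal into ordered subgoals.
    return ControlFlowNode("sequence", [
        AgentNode("find the apple", find_apple),
        AgentNode("put the apple in the fridge", store_apple),
    ])


memory: WorkingMemory = {}
root = AgentNode("store the apple in the fridge", root_policy)
succeeded = root.run(memory)
print(succeeded, memory)
```

In this toy run the root expands into a two-step sequence, and the second agent succeeds only because the first one wrote the apple's location to the shared store. Retrieval of subgoal-level examples from episodic memory is left out to keep the sketch short.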

Quantitative Results

ReAcTree consistently outperforms baselines across diverse LLMs. Notably, on the WAH-NL dataset, ReAcTree + WM achieves a 61% Goal Success Rate (GSR) with Qwen 2.5 72B, nearly doubling the 31% achieved by ReAct + WM.

1. Main Results on WAH-NL

Each cell reports GSR / SSR.

| Method | LLaMA 3.1 8B | LLaMA 3.1 70B | Qwen 2.5 7B | Qwen 2.5 72B | Mistral 7B | Gemma 2 9B | Phi-4-RP 14B |
|---|---|---|---|---|---|---|---|
| ZSP [19] | 1.00 / 13.03 | 0.00 / 14.42 | 0.00 / 8.98 | 0.00 / 14.22 | 0.00 / 11.65 | 1.00 / 13.87 | 0.00 / 17.90 |
| Tree-Planner (N=25) | 1.00 / 17.00 | 2.00 / 16.72 | 6.00 / 22.23 | 6.00 / 32.41 | 1.00 / 20.43 | 2.00 / 17.58 | 3.00 / 17.52 |
| Tree-Planner (N=50) | 4.00 / 21.85 | 4.00 / 23.43 | 8.00 / 28.10 | 9.00 / 36.03 | 6.00 / 23.63 | 3.00 / 23.30 | 4.00 / 20.40 |
| ReAct [56] | 8.00 / 34.25 | 30.00 / 57.05 | 10.00 / 31.82 | 26.00 / 51.38 | 6.00 / 28.18 | 9.00 / 37.20 | 33.00 / 48.13 |
| ReAct + WM | 16.00 / 42.65 | 33.00 / 63.15 | 13.00 / 39.73 | 31.00 / 54.05 | 9.00 / 31.95 | 11.00 / 39.93 | 33.00 / 51.28 |
| ReAcTree (Ours) | 21.00 / 51.98 | 32.00 / 60.58 | 18.00 / 50.20 | 48.00 / 75.13 | 11.00 / 37.92 | 26.00 / 60.43 | 49.00 / 67.47 |
| ReAcTree + WM (Ours) | 30.00 / 60.77 | 58.00 / 79.27 | 37.00 / 59.63 | 61.00 / 79.58 | 15.00 / 49.57 | 38.00 / 67.08 | 49.00 / 69.30 |

Table 1. Goal Success Rate (GSR) and Subgoal Success Rate (SSR) on WAH-NL dataset (%). Bold indicates the best result, and underline indicates the second best.
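As a rough guide to reading the two metrics, the sketch below computes them from per-task subgoal counts. The definitions are assumptions, since this page does not spell them out: GSR is taken as the share of tasks in which every subgoal is satisfied, and SSR as the mean fraction of satisfied subgoals per task. The function name and the toy episodes are hypothetical.

```python
from typing import List, Tuple


def success_rates(episodes: List[Tuple[int, int]]) -> Tuple[float, float]:
    """Return (GSR, SSR) in percent.

    Each episode is (satisfied_subgoals, total_subgoals) for one task.
    Assumed definitions: GSR counts fully solved tasks, SSR averages the
    per-task fraction of satisfied subgoals.
    """
    if not episodes:
        raise ValueError("at least one episode is required")
    n = len(episodes)
    gsr = sum(done == total for done, total in episodes) / n
    ssr = sum(done / total for done, total in episodes) / n
    return 100 * gsr, 100 * ssr


# Toy data: two tasks fully solved, one half solved, one not solved at all.
episodes = [(4, 4), (2, 4), (0, 5), (3, 3)]
gsr, ssr = success_rates(episodes)
print(f"GSR={gsr:.2f}%, SSR={ssr:.2f}%")  # GSR=50.00%, SSR=62.50%
```

Under these assumed definitions a fully solved task contributes 1 to both sums, so GSR can never exceed SSR for the same set of tasks.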

2. Main Results on ALFRED

| Split | Method | LLaMA 3.1 70B | Qwen 2.5 72B |
|---|---|---|---|
| Valid-Seen | ReAct + WM | 33.31 | 37.07 |
| Valid-Seen | ReAcTree + WM | 40.00 | 40.85 |
| Valid-Unseen | ReAct + WM | 32.40 | 39.10 |
| Valid-Unseen | ReAcTree + WM | 37.03 | 39.83 |

Table 2. Goal Success Rate (GSR) on ALFRED dataset (%). ReAcTree + WM outperforms ReAct + WM on both the Valid-Seen and Valid-Unseen splits, indicating that it generalizes to unseen environments.

Qualitative Results

**Figure 3. Qualitative example on WAH-NL.** (Left) ReAcTree successfully completes the long-horizon task by effectively combining reasoning, acting, and planning capabilities. (Right) ReAct fails the task due to repetitive actions and lack of hierarchical planning.

Citation

@misc{choi2025reactree,
      title={ReAcTree: Hierarchical LLM Agent Trees with Control Flow for Long-Horizon Task Planning}, 
      author={Jae-Woo Choi and Hyungmin Kim and Hyobin Ong and Youngwoo Yoon and Minsu Jang and Dohyung Kim and Jaehong Kim},
      year={2025},
      eprint={2511.02424},
      archivePrefix={arXiv},
      primaryClass={cs.AI},
      url={https://arxiv.org/abs/2511.02424},
}