• AI Made Simple

Enhancing Planning Abilities in LLM-Based Agents: The AGENTGEN Approach

The paper "AGENTGEN: Enhancing Planning Abilities for Large Language Model based Agent via Environment and Task Generation" by Mengkang Hu and colleagues tackles a significant challenge in the realm of AI: improving the planning capabilities of Large Language Model (LLM)-based agents. The core idea is to automate the creation of diverse environments and tasks, which can then be used to train these agents more effectively. This approach aims to make LLMs better at planning and executing actions across various scenarios.

The Core Idea

The methodology of AGENTGEN is a two-stage process. First, LLMs generate diverse environments using an inspiration corpus composed of domain-specific text segments. This corpus provides the context needed for the LLMs to create environment specifications, which are then converted into code. Second, conditioned on these generated environments, LLMs create planning tasks with varying difficulty levels using a bidirectional evolution method (BI-EVOL). This method evolves tasks from both easier and harder directions, ensuring a smooth difficulty curve.
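The two-stage process described above can be sketched in a few lines. Everything here is a hypothetical illustration: the function names, prompts, and the stubbed LLM callable are inventions for this post, not the paper's actual implementation.

```python
# Sketch of the AGENTGEN two-stage idea (hypothetical names; the real
# system prompts an LLM -- stubbed here as a plain callable).

def generate_environment(inspiration_segment, llm):
    """Stage 1: turn a domain-specific text segment into an environment."""
    spec = llm(f"Design a planning environment inspired by: {inspiration_segment}")
    code = llm(f"Implement this environment specification as code: {spec}")
    return {"spec": spec, "code": code}

def bi_evol(seed_task, llm, steps=1):
    """Stage 2 (BI-EVOL): evolve a seed task in both directions,
    toward easier and toward harder variants."""
    easier, harder = [seed_task], [seed_task]
    for _ in range(steps):
        easier.append(llm(f"Simplify this planning task: {easier[-1]}"))
        harder.append(llm(f"Add constraints to make this task harder: {harder[-1]}"))
    # Order: easiest first, seed in the middle, hardest last.
    return easier[1:][::-1] + [seed_task] + harder[1:]

# Stub LLM for demonstration: echoes the instruction part of the prompt.
fake_llm = lambda prompt: prompt.split(":")[0]
env = generate_environment("a warehouse robot stacking crates", fake_llm)
tasks = bi_evol("move crate A onto crate B", fake_llm, steps=1)
```

With more evolution steps, the returned task list grows into a ladder from trivial to challenging variants of the seed task.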

Key Features

One of the standout features of AGENTGEN is its automated environment and task generation. Unlike previous studies that rely on manually designed environments and tasks, AGENTGEN automates this process, significantly increasing the diversity and quantity of training data. The use of a diverse text corpus as context for generating environments ensures a wide range of scenarios and domains. Additionally, the bidirectional evolution method creates a task set with a smoother difficulty curve, enhancing the learning process.
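Why does a smoother difficulty curve help? A toy example makes the curriculum intuition concrete. The tasks and numeric difficulty scores below are made up for illustration; AGENTGEN obtains graded difficulty through BI-EVOL, not through explicit scores like these.

```python
# Toy illustration of a smooth difficulty curve (scores are invented).
tasks = [
    ("stack two blocks", 1),
    ("reorder a tower of blocks", 3),
    ("stack five blocks with one arm", 2),
    ("reorder a tower under a time limit", 4),
    ("pick up one block", 0),
]

# A curriculum orders tasks from easy to hard so each step builds on the last.
curriculum = sorted(tasks, key=lambda t: t[1])
gaps = [b[1] - a[1] for a, b in zip(curriculum, curriculum[1:])]
# With a dense, evolved task set the difficulty jumps between
# consecutive tasks stay small (here, every gap is 1).
```

A sparse task set would leave large gaps in this sequence, forcing the model to make big leaps; bidirectional evolution fills in the intermediate rungs.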

Experimental Setup and Results

The experimental setup synthesized environments and planning tasks in the Planning Domain Definition Language (PDDL), yielding a dataset of 592 environments and 7,246 high-quality trajectories. The trained models were evaluated both in-domain (PDDL-based tasks) and out-of-domain (tasks specified in other languages than PDDL). The results were strong: AGENTGEN significantly improved the planning abilities of LLMs. For instance, the AGENTGEN instruction-tuned Llama-3 8B model outperformed GPT-3.5 overall and even surpassed GPT-4 on certain tasks. The model also showed substantial gains in success rate on out-of-domain benchmarks such as ALFWorld and BabyAI.
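For readers unfamiliar with PDDL, here is a minimal hand-written domain/problem pair in standard PDDL syntax, an illustration of the formalism only, not an example from the AGENTGEN dataset. It is embedded as Python strings with a cheap structural sanity check of the kind a generation pipeline might run on LLM output.

```python
# A minimal PDDL domain and problem (hand-written illustration).
DOMAIN = """
(define (domain gripper)
  (:predicates (at-robot ?r) (at ?b ?r) (carrying ?b))
  (:action pick
    :parameters (?b ?r)
    :precondition (and (at-robot ?r) (at ?b ?r))
    :effect (and (carrying ?b) (not (at ?b ?r)))
  )
)
"""

PROBLEM = """
(define (problem move-ball)
  (:domain gripper)
  (:objects ball room-a room-b)
  (:init (at-robot room-a) (at ball room-a))
  (:goal (carrying ball))
)
"""

def balanced(s):
    """Cheap sanity check: parentheses in a PDDL string must balance."""
    depth = 0
    for ch in s:
        depth += ch == "("
        depth -= ch == ")"
        if depth < 0:
            return False
    return depth == 0
```

Generating hundreds of such domain/problem pairs automatically, rather than writing them by hand, is what lets AGENTGEN scale its training data.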

Advantages and Limitations

Advantages:

  • Automated Generation: Reduces the labor-intensive nature of manual design.

  • Smooth Difficulty Curve: Enhances the learning process through the bidirectional evolution method.

  • Significant Improvements: Demonstrated substantial improvements in planning abilities across multiple tasks and domains.

Limitations:

  • Reliance on Inspiration Corpus: The approach relies heavily on the quality and diversity of the inspiration corpus.

  • Variable Effectiveness: The effectiveness of the generated environments and tasks may vary depending on the specific application or domain.

Conclusion

AGENTGEN represents a significant advancement in enhancing the planning abilities of LLM-based agents by automating the generation of diverse environments and tasks. The use of an inspiration corpus and the bidirectional evolution method are key innovations that contribute to its effectiveness. The experimental results validate the approach, showing substantial improvements over existing models like GPT-3.5 and even surpassing GPT-4 in certain scenarios. However, the approach's reliance on the inspiration corpus highlights a potential area for further refinement.