Language Models can Infer Action Semantics for Classical Planners from Environment Feedback
Authors: Wang Zhu, Ishika Singh, Robin Jia, Jesse Thomason
Year: 2024
Source:
https://arxiv.org/abs/2406.02791
TLDR:
The paper presents PSALM, a novel framework that integrates Large Language Models (LLMs) with classical planning to infer action semantics in new environments without manual annotation. PSALM leverages the commonsense reasoning abilities of LLMs to generate candidate trajectories and predict action semantics based on environment feedback, while using a symbolic solver to search for solutions that achieve the desired goal state. The method is shown to be effective and efficient across seven diverse environments, consistently inducing correct domain files with fewer execution steps and resets compared to baseline approaches. The paper also discusses limitations and future work, emphasizing the potential of LLMs in advancing domain induction capabilities in robotics.
The paper introduces PSALM, a framework that combines the strengths of Large Language Models (LLMs) and classical planning to enable autonomous agents to infer action semantics in novel environments, thereby reducing the need for human expertise in specifying domain dynamics and improving the efficiency of task execution in robotics.
Abstract
Classical planning approaches guarantee finding a set of actions that can achieve a given goal state when possible, but require an expert to specify logical action semantics that govern the dynamics of the environment. Researchers have shown that Large Language Models (LLMs) can be used to directly infer planning steps based on commonsense knowledge and minimal domain information alone, but such plans often fail on execution. We bring together the strengths of classical planning and LLM commonsense inference to perform domain induction, learning and validating action pre- and post-conditions based on closed-loop interactions with the environment itself. We propose PSALM, which leverages LLM inference to heuristically complete partial plans emitted by a classical planner given partial domain knowledge, as well as to infer the semantic rules of the domain in a logical language based on environment feedback after execution. Our analysis on 7 environments shows that with just one expert-curated example plan, using LLMs as heuristic planners and rule predictors achieves lower environment execution steps and environment resets than random exploration while simultaneously recovering the underlying ground truth action semantics of the domain.
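For concreteness, the "action semantics" being induced are the preconditions and effects of each operator in a PDDL-style domain file. The sketch below shows one way such a schema could be represented in Python; the Blocksworld-style "pick-up" action and its predicate names are illustrative assumptions, not taken from the paper.

```python
from dataclasses import dataclass, field
from typing import List, Set

@dataclass
class ActionSemantics:
    """A PDDL-style action schema: preconditions must hold before the action
    executes; add/delete effects describe how the state changes afterwards."""
    name: str
    parameters: List[str]
    preconditions: Set[str] = field(default_factory=set)
    add_effects: Set[str] = field(default_factory=set)
    del_effects: Set[str] = field(default_factory=set)

# Hypothetical Blocksworld-style "pick-up" action, for illustration only;
# the predicate names are assumptions rather than the paper's domain.
pick_up = ActionSemantics(
    name="pick-up",
    parameters=["?x"],
    preconditions={"(clear ?x)", "(ontable ?x)", "(handempty)"},
    add_effects={"(holding ?x)"},
    del_effects={"(clear ?x)", "(ontable ?x)", "(handempty)"},
)
```

Domain induction, in this framing, means recovering the precondition and effect sets for every action from execution feedback rather than from an expert-written domain file.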
Method
The authors used a methodology that combines Large Language Models (LLMs) with classical planning techniques to infer action semantics in new environments. Their approach, named PSALM (Predicting Semantics of Actions with Language Models), leverages the commonsense reasoning capabilities of LLMs to generate candidate trajectories and predict action semantics based on feedback from the environment. A symbolic solver is then used to search for solutions that achieve the desired goal state, using the predicted action semantics. This iterative process involves maintaining a probabilistic memory of learned action semantics, which is updated based on interactions with the environment. The authors conducted experiments across seven different environments to validate the effectiveness and efficiency of their approach.
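A rough sketch of this closed loop in Python follows. It is not the authors' implementation: the environment interface (initial_state, execute, reset, goal_reached) and the solve, llm_complete, and llm_predict callables are assumed placeholders for the symbolic solver and LLM components described above.

```python
from typing import Callable, Dict, List, Tuple

def psalm_loop(
    env,                       # assumed env with initial_state, execute(), reset(), goal_reached()
    goal: str,
    solve: Callable,           # symbolic solver: (semantics, state, goal) -> partial plan
    llm_complete: Callable,    # LLM heuristic: partial plan -> full candidate plan
    llm_predict: Callable,     # LLM rule predictor: (semantics, memory) -> revised semantics
    init_semantics: Dict[str, dict],
    max_iters: int = 50,
) -> Dict[str, dict]:
    """Sketch of a PSALM-style induction loop under assumed interfaces (not the
    authors' code). Semantics map each action name to believed preconditions
    and effects, which are revised from environment feedback."""
    semantics = dict(init_semantics)
    memory: List[Tuple[list, str]] = []   # executed plans and their feedback

    for _ in range(max_iters):
        # 1. Plan with the current (possibly incorrect) action semantics.
        partial_plan = solve(semantics, env.initial_state, goal)
        # 2. Let the LLM heuristically complete the partial plan.
        candidate_plan = llm_complete(partial_plan)
        # 3. Execute until the first failure; collect error messages.
        feedback, succeeded = env.execute(candidate_plan)
        memory.append((candidate_plan, feedback))
        if succeeded and env.goal_reached(goal):
            return semantics              # induced domain consistent with the env
        # 4. Revise predicted pre-/post-conditions from accumulated feedback.
        semantics = llm_predict(semantics, memory)
        env.reset()                        # counted as an environment reset

    return semantics
```

The two quantities reported in the paper's experiments, execution steps and environment resets, correspond here to the actions attempted inside env.execute and the calls to env.reset.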
Main Finding
The authors discovered that their proposed framework, PSALM, which integrates Large Language Models (LLMs) with classical planning, is capable of effectively and efficiently inferring action semantics in new environments without the need for manual annotation by human experts. Through their experiments, they found that PSALM could consistently induce correct domain files across seven diverse environments, requiring substantially fewer execution steps and resets compared to multiple baseline methods. This integration of LLMs and symbolic solvers presents a promising avenue for advancing domain induction capabilities in robotics, enabling autonomous agents to explore and learn new tasks in novel environments.
Conclusion
The conclusion of the paper is that the proposed framework, PSALM, successfully demonstrates the integration of Large Language Models (LLMs) with classical planning to infer action semantics in new environments without manual annotation. The experiments conducted across seven environments show that PSALM can induce correct domain files with fewer execution steps and resets compared to baseline methods, highlighting the potential of LLMs in enhancing domain induction capabilities in robotics. The paper also acknowledges limitations and suggests future work to further explore and refine the methodology.
Keywords
Large Language Models, Classical Planning, Action Semantics, Domain Induction, PSALM, Environment Feedback, Robotics, Task Planning, PDDL, Symbolic Solvers, Commonsense Reasoning, Heuristic Planning, Rule Prediction, Domain Files, Execution Steps, Environment Resets, Ground Truth Action Semantics, Probabilistic Memory, Trajectory Sampling, Action Semantics Prediction, Memory Augmentation, Knowledge Acquisition, Planning Domain Definition Language, Answer Set Programming, STRIPS Planner, API-based Programmatic Plan Generation, Program Synthesis, Language Model Prior, Error Messages, Domain Induction Problem, Planning Problem Formulation, PDDL Domain File