Solving multi-step problems with LLMs via deliberate planning and exploration…
As large language models (LLMs) first started to gain in popularity, they were criticized for their shortcomings in solving complex, reasoning-based problems. Although scaling up these models (i.e., more parameters and more data) provided a near-uniform performance improvement across tasks, we saw virtually no boost in performance on reasoning-based tasks with modern LLMs. This changed with the proposal of advanced prompting techniques, such as chain of thought prompting [2] and self-consistency [3]. Such methods showed us that LLMs are more than capable of reasoning and solving complex, multi-step problems. They just have to be properly prompted to fully leverage these abilities.
“It is perhaps surprising that underlying all this progress is still the original autoregressive mechanism for generating text, which makes token-level decisions one by one and in a left-to-right fashion.” — from [1]
Even if proper prompting can enable LLMs to solve complex problems, these techniques are lacking. Namely, we typically i) provide a prompt to the LLM and ii) expect the model to use next token prediction to generate a full solution. Certain approaches may generate solutions in a step-by-step fashion (e.g., least-to-most prompting [8]), but the LLM still follows a single reasoning path instead of exploring several potential solutions. For complex problems in which initial decisions can completely derail a solution, such an approach will fall short, which is an issue given that LLMs are now commonly used as general purpose problem solvers in a variety of practical applications. Put simply, we need a prompting approach that performs more deliberate planning and exploration when solving problems.
In [1], authors propose such an approach — called Tree of Thoughts prompting — that solves problems by explicitly decomposing them into a series of thoughts, or intermediate steps. Similar to chain of thoughts prompting, tree of thoughts prompting generates a solution that is simply a sequence of individual thoughts. However, this approach goes further by allowing multiple reasoning path to be considered at once — forming a tree of potential thoughts or reasoning paths — and exploring this…