
Chain of Continuous Thought Promises Mighty Boost for LLMs and Generative AI by Blowing up the Fixation on Tokens
Researchers at Meta recently introduced a groundbreaking concept called "Chain of Continuous Thought" (Coconut), which has the potential to reshape how Large Language Models (LLMs) and generative AI perform reasoning. By moving beyond the conventional token-based reasoning approach, this innovative paradigm opens the door to more advanced and efficient reasoning patterns.
According to the researchers, standard chain-of-thought reasoning confines an LLM to a "language space": every intermediate step must be verbalized as tokens, even though many of those tokens serve textual fluency rather than reasoning itself. This constraint can hinder the model's capacity to tackle complex problems that require sophisticated planning and backtracking. Coconut lifts the constraint by letting the model reason directly in a continuous latent space.
The Coconut method takes the LLM's last hidden state as a representation of the current reasoning state and feeds it back into the model as the next input embedding, staying in the same continuous space rather than decoding a token. Because a single continuous thought can encode several alternative next steps at once, the model can defer commitment and explore solutions in a pattern resembling breadth-first search (BFS), instead of prematurely locking onto a single path.
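The loop below is a minimal sketch of that feedback mechanism, assuming a Hugging Face-style causal LM that accepts precomputed input embeddings (`inputs_embeds`). The choice of GPT-2, the prompt, and the number of latent steps are illustrative assumptions, not the authors' released implementation, and the published method also trains the model specifically for this latent mode rather than applying it to an off-the-shelf checkpoint.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Illustrative choice of model; any causal LM whose hidden size matches
# its embedding size can be swapped in.
model = AutoModelForCausalLM.from_pretrained("gpt2")
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model.eval()

prompt = "Question: If Alice is older than Bob and Bob is older than Carol, who is youngest? Reasoning:"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids
embeds = model.get_input_embeddings()(input_ids)  # (1, seq_len, hidden)

NUM_LATENT_THOUGHTS = 4  # assumed number of continuous-thought steps

with torch.no_grad():
    for _ in range(NUM_LATENT_THOUGHTS):
        out = model(inputs_embeds=embeds, output_hidden_states=True)
        # Last hidden state at the final position is the "continuous thought".
        thought = out.hidden_states[-1][:, -1:, :]  # (1, 1, hidden)
        # Feed it back as the next input embedding -- no token is decoded,
        # so the reasoning never round-trips through the vocabulary.
        embeds = torch.cat([embeds, thought], dim=1)

    # After the latent phase, switch back to ordinary token decoding.
    logits = model(inputs_embeds=embeds).logits[:, -1, :]
    next_token = logits.argmax(dim=-1)
    print(tokenizer.decode(next_token))
```

Note one design convenience this sketch relies on: in GPT-2 the hidden-state dimension equals the embedding dimension, so the final hidden state can be appended directly as an input embedding; an architecture where the two differ would need a learned projection in between.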
Early experiments indicate that Coconut can effectively augment LLMs on a range of reasoning tasks, improving performance over traditional chain-of-thought (CoT) approaches. In particular, Coconut outperforms CoT on logical reasoning tasks that demand substantial planning and backtracking, while also generating fewer thinking tokens during inference.
It is crucial to recognize that these findings do not diminish the importance of LLMs and CoT-based methods. Instead, they offer a novel avenue for expanding the capabilities of AI systems. By shifting our focus away from token-based reasoning and embracing this continuous latent space paradigm, we may unlock previously unexplored potential in the realm of generative AI.
The implications of Coconut are far-reaching and have significant potential to reshape the future of AI research.
Source: www.forbes.com