Researchers from Stanford University and Notbad AI developed Quiet-STaR, a technique that trains a language model (LM) to reason internally before generating an output.
When humans speak, we normally have an inner dialogue that shapes the words we eventually verbalize. The more we think before speaking, the better the quality of our spoken words.
In their paper, the researchers describe how they trained an LM (Mistral-7B) to learn how to imitate this process in a generalized way. Quiet-STaR is a progression of another technique called STaR, or Self-Taught Reasoner.
STaR is a method of training a model with a few examples of questions whose answers come with explanations (rationales). The model uses these chain-of-thought examples to try answering new questions on its own, working out the rationales itself.
STaR then checks whether each rationale leads to the correct answer and fine-tunes on the ones that do, refining its reasoning over successive rounds.
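To make that loop concrete, here is a minimal sketch of one STaR iteration. The `generate` and `finetune` callables and the prompt format are hypothetical stand-ins for the model’s sampling and training APIs, and the paper’s extra “rationalization” step (hinting the model with the answer when it fails) is omitted.

```python
# Minimal sketch of one STaR iteration. `generate` and `finetune` are
# hypothetical stand-ins for the model's sampling and fine-tuning APIs.

def star_iteration(model, dataset, few_shot_examples, generate, finetune):
    """One outer loop of Self-Taught Reasoner (STaR), simplified."""
    kept = []
    for question, gold_answer in dataset:
        # Prompt with a few worked examples, then the new question.
        prompt = few_shot_examples + f"\nQ: {question}\nRationale:"
        rationale, answer = generate(model, prompt)

        # Keep only rationales that actually led to the correct answer.
        if answer == gold_answer:
            kept.append((question, rationale, answer))

    # Fine-tune the model on its own successful chains of thought.
    return finetune(model, kept)
```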
As impressive as STaR is, its ability to reason is limited to the question-answering (QA) contexts it sees during training. The goal of Quiet-STaR is to give an LM a generalized ability to reason, or develop rationales, across a broader range of text, not just QA datasets.
How does Quiet-STaR work?
Language models today are trained to reason either 1) generally, imitating online reasoning data or 2) narrowly, self-teaching on their own solutions to specific tasks
Can LMs teach themselves to reason generally?🌟Introducing Quiet-STaR, self-teaching via internal monologue!🧵 pic.twitter.com/WCSxLPZeCX
— Eric Zelikman (@ericzelikman) March 15, 2024
One of the key innovations in Quiet-STaR is that it generates rationales, or thoughts, in parallel after every token of the text it is processing. It doesn’t output this chain-of-thought reasoning, hence the “Quiet” part of the algorithm’s name.
The algorithm blends each rationale’s prediction with the base model’s through a learned “mixing head”. Each rationale is then evaluated on whether the next-token prediction it produced is more accurate than the prediction the base model makes without it.
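The paper describes the mixing head as a small learned network; the sketch below shows the general idea of interpolating between the next-token logits produced with and without a thought. The two-layer MLP, the sigmoid gate, and the exact inputs are illustrative assumptions, not the paper’s precise architecture.

```python
import torch
import torch.nn as nn

class MixingHead(nn.Module):
    """Illustrative mixing head: learns how much weight to give the
    post-thought prediction versus the base (no-thought) prediction."""

    def __init__(self, hidden_size: int):
        super().__init__()
        self.gate = nn.Sequential(
            nn.Linear(2 * hidden_size, hidden_size),
            nn.ReLU(),
            nn.Linear(hidden_size, 1),
            nn.Sigmoid(),
        )

    def forward(self, base_hidden, thought_hidden, base_logits, thought_logits):
        # Weight in [0, 1] deciding how much the thought influences the output.
        w = self.gate(torch.cat([base_hidden, thought_hidden], dim=-1))
        return w * thought_logits + (1 - w) * base_logits
```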
If the base model (without Quiet-STaR) delivers a better prediction, then the rationale wasn’t a good one. If the rationale results in a more accurate next-token prediction, then the algorithm knows it’s on to a good thing.
It then uses a reinforcement learning algorithm (REINFORCE) to learn which rationales help and which ones hinder the model’s performance. The result is that the model learns a generalized ability to think before predicting the next token.
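A rough sketch of that training signal, assuming the per-position log-probabilities of the true next tokens (with and without each thought) have already been computed. The tensor shapes and function name are illustrative, and the paper’s full objective includes additional terms that this sketch leaves out.

```python
import torch

def reinforce_loss(logp_next_with_thought, logp_next_without_thought,
                   thought_token_logps):
    """Illustrative REINFORCE-style objective for thought generation.

    logp_next_with_thought:     log p(true next tokens | context + thought)
    logp_next_without_thought:  log p(true next tokens | context only)
    thought_token_logps:        summed log-probs of the sampled thought tokens
    All tensors are assumed to have shape [batch, positions].
    """
    # Reward: how much the thought improved prediction of the real next text.
    reward = (logp_next_with_thought - logp_next_without_thought).detach()

    # Reinforce thoughts that helped (positive reward) and discourage
    # thoughts that hurt (negative reward).
    return -(reward * thought_token_logps).mean()
```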
Quiet-STaR results
The researchers tested the Quiet-STaR trained Mistral-7B model on the GSM8K math and CommonsenseQA common sense reasoning benchmarks. They found that Quiet-STaR improved perplexity and zero-shot direct reasoning abilities on both CommonsenseQA (36.3% to 47.2%) and GSM8K (5.9% to 10.9%) benchmarks.
While Mistral-7B’s math reasoning still isn’t great, going from 5.9% to 10.9% on GSM8K is a relative improvement of almost 85% over the base model, and it came without any dataset-specific fine-tuning.
Test results also showed that improvements in performance were directly related to how many tokens were allocated to the model’s internal thoughts. The more it thought before answering, the better the answer.
These improvements come at the cost of a substantial computing overhead. The inner monologue the model engages in during the thought process generates a lot of tokens.
Improvements in hardware will eventually make the additional overhead that comes with techniques like these less consequential.
The researchers conclude that future work on optimizing Quiet-STaR could help too. Dynamically predicting if a thought process is required, or how long it should be, could cut down on unnecessary thought tokens.
The results from training a small model like Mistral-7B with Quiet-STaR are promising. The researchers believe that “the same techniques applied to a better model would likely yield disproportionately better results.”
Ethical questions
Making a language model reason more like a human comes with some interesting issues and ethical questions.
The researchers note that “it is impossible to know that the reasoning expressed by the model in language accurately represents the internal processing of the model.” The rationales the model generates are natural language representations of its inner reasoning. Are they an accurate reflection?
They further note that “there are no safeguards against harmful or biased reasoning patterns if the model finds them useful.”
We may be happy with an AI model’s answer, but we might not like, or even understand, the thinking process that delivered it.
One of the paper’s lead authors, Eric Zelikman, just joined Elon Musk’s xAI this week. He may find that Grok is less concerned with these ethical questions and more excited by the prospect of AI advancement.