Over the last six months, we’ve seen a boom of AI-powered large language models (LLMs) take center stage. But does every AI product or service need to be built on an LLM? According to a new paper, MIT’s self-learning language models are not large language models, yet they can outperform some of the bigger AI systems that currently lead the industry.
A group of researchers at MIT Computer Science and Artificial Intelligence Laboratory (CSAIL) developed a new way to approach AI language models.
It’s a groundbreaking result that emphasizes smaller language models and their ability to address the inefficiency and privacy concerns tied to developing large AI models trained on textual data.
With the emergence of OpenAI’s ChatGPT, based on the language models GPT-3 and GPT-4, many companies joined the AI race, including Google with Bard, alongside other generative AI systems that let people generate text, images, and even videos.
However, to generate high-quality output, these systems rely on vast amounts of data that are computationally expensive to process. Many of them also import training data via APIs, which carries its own risks, such as data leaks and other privacy concerns.
In the new paper, titled Entailment as Robust Self-Learner and published on the preprint repository arXiv, the researchers note that the MIT self-learning language models address certain language-understanding tasks that large language models struggle with. The approach rests on a concept called textual entailment.
The models are based on the idea that, given two sentences, a premise and a hypothesis, if the premise is true then the hypothesis is likely to be true as well.
In a statement published on the MIT CSAIL blog, one example given is: if the premise “all cats have tails” is true, then the hypothesis “a tabby cat has a tail” is likely true as well. According to the statement, this approach leads to less bias in AI models, which is what allows the new MIT self-learning language models to outperform larger ones.
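To make the premise/hypothesis structure concrete, here is a minimal, purely illustrative sketch. It uses a hand-written pattern-matching rule for the one "all X have Y" example above; the MIT models, by contrast, learn the entailment relation from data with a neural network, and nothing below comes from the paper.

```python
# Toy illustration of textual entailment (hypothetical, rule-based;
# the MIT models learn this relation rather than hard-coding it).
import re

def entails(premise: str, hypothesis: str) -> bool:
    """Return True if a universal premise like 'all cats have tails'
    entails an instance hypothesis like 'a tabby cat has a tail'."""
    # Parse a premise of the form "all <noun>s have <attribute>s".
    m = re.fullmatch(r"all (\w+)s have (\w+)s", premise.lower())
    # Parse a hypothesis of the form "a [modifiers] <noun> has a <attribute>".
    h = re.fullmatch(r"a [\w ]*?(\w+) has a (\w+)", hypothesis.lower())
    if not (m and h):
        return False
    category, attribute = m.groups()
    noun, attr = h.groups()
    # The hypothesis follows if it asserts the same attribute
    # about an instance of the same category.
    return noun == category and attr == attribute

print(entails("All cats have tails", "A tabby cat has a tail"))  # True
print(entails("All cats have tails", "A dog has a tail"))        # False
```

The point of the sketch is only the shape of the task: a binary yes/no judgment over a sentence pair, which is what the entailment models are trained to produce.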
“Our self-trained, 350M-parameter entailment models, without human-generated labels, outperform supervised language models with 137 to 175 billion parameters,” said Hongyin Luo, an MIT CSAIL postdoctoral associate and the paper’s lead author, in a statement.
He added that this approach could be highly beneficial to current AI systems and could reshape machine learning to be more scalable, trustworthy, and cost-effective when working with language models.
New MIT self-learning language models are still limited
Even though the new MIT self-learning language models show a lot of promise on binary classification problems, their performance is still limited on multi-class classification problems. In other words, textual entailment doesn’t work as well when the model is presented with multiple choices.
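One standard workaround for this mismatch, common in zero-shot classification work but not attributed to this paper, is to reduce a multi-class problem to a series of binary entailment checks: one hypothesis per candidate label, keeping the label whose hypothesis is entailed most strongly. The sketch below assumes a hypothetical `entailment_score` stand-in (simple word overlap) where a real system would call a trained entailment model.

```python
# Sketch: reducing multi-class classification to binary entailment
# (a standard reframing; the scoring function is a stand-in, not
# the MIT model).
def entailment_score(premise: str, hypothesis: str) -> float:
    """Hypothetical stand-in: word-overlap score in [0, 1].
    A real system would use a trained entailment model here."""
    p = set(premise.lower().split())
    h = set(hypothesis.lower().split())
    return len(p & h) / len(h)

def classify(text: str, labels: list[str]) -> str:
    """Ask one binary entailment question per label, pick the best."""
    hypotheses = {lab: f"this text is about {lab}" for lab in labels}
    return max(labels, key=lambda lab: entailment_score(text, hypotheses[lab]))

print(classify("breaking sports news: the striker scored",
               ["politics", "sports"]))  # picks "sports"
```

The design choice here is the interesting part: each label becomes its own yes/no entailment query, so a binary model can in principle handle many classes, at the cost of running one inference per label.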
According to James Glass, an MIT professor and CSAIL principal investigator who co-authored the paper, this research could shed light on efficient and effective methods for training LLMs to understand contextual entailment problems.
“While the field of LLMs is undergoing rapid and dramatic changes, this research shows that it is possible to produce relatively compact language models that perform very well on benchmark understanding tasks compared to their peers of roughly the same size, or even much larger language models,” he said.
This research is just the beginning of future AI technologies that could learn on their own while being more efficient, sustainable, and focused on data privacy. The paper on the new MIT self-learning language models will be presented in July at the Meeting of the Association for Computational Linguistics in Toronto. The project is also backed by the Hong Kong Innovation AI program.