MIT researchers have uncovered how leveraging the concept of symmetry within datasets can reduce the volume of data needed for training models.
This discovery, documented in a study available on arXiv by Behrooz Tahmasebi, an MIT Ph.D. student, and his advisor Stefanie Jegelka, an associate professor at MIT, is rooted in a century-old mathematical result known as Weyl’s law.
Weyl’s law, originally formulated by German mathematician Hermann Weyl over 110 years ago, was designed to measure the complexity of spectral information, such as the vibrations of musical instruments.
Inspired by this law while studying differential equations, Tahmasebi saw its potential to reduce the complexity of the data fed into neural networks. By understanding the symmetries inherent in a dataset, a machine learning model can be made faster and more efficient without actually being given more data.
Tahmasebi and Jegelka’s paper explains how exploiting symmetries, or “invariances,” within datasets can simplify machine learning tasks, in turn requiring less training data.
That sounds very complex, but the principle is relatively straightforward. For example, think of the letter ‘X’ — whether you rotate it or flip it, it still looks like an ‘X.’ In machine learning, when models understand this idea, they can learn more efficiently. They realize that even if an image of a cat is turned upside down or mirrored, it still shows a cat.
This helps the model make better use of its data, learning from every example in multiple ways and reducing the need for a huge amount of data to achieve accurate results.
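To make that concrete, here is a minimal sketch in Python with NumPy (illustrative only, not code from the study) of what “learning from every example in multiple ways” can look like in practice: each training image is expanded into its rotated and mirrored copies before the model ever sees it.

```python
import numpy as np

def symmetry_augment(image: np.ndarray) -> list:
    """Return the rotated and mirrored copies of an image.

    One labeled example becomes eight, so a model trained on the
    augmented set effectively sees more data without any new data
    having been collected.
    """
    orbit = []
    for k in range(4):                    # rotations by 0, 90, 180, 270 degrees
        rotated = np.rot90(image, k)
        orbit.append(rotated)
        orbit.append(np.fliplr(rotated))  # mirrored copy of each rotation
    return orbit

# A single 3x3 "image" yields 8 equally valid training examples.
example = np.arange(9).reshape(3, 3)
print(len(symmetry_augment(example)))  # 8
```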
However, the study goes deeper than symmetry in the everyday sense. The invariances it analyzes, in the setting of kernel ridge regression (KRR), cover not only symmetrical transformations such as rotations and reflections but any characteristic of the data that remains unchanged under a specific operation.
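One standard way to build such an invariance directly into a kernel method, shown here as a simplified sketch rather than the paper’s construction, is to average an ordinary kernel over the group of transformations; a kernel ridge regression built on the averaged kernel then cannot tell an input apart from its rotated or reflected versions.

```python
import numpy as np

def rbf(x, y, gamma=1.0):
    """Ordinary RBF kernel between two flattened inputs."""
    return np.exp(-gamma * np.sum((x - y) ** 2))

def invariant_kernel(img_x, img_y, gamma=1.0):
    """Average the RBF kernel over rotations and flips of one argument.

    Averaging over the symmetry group makes the kernel, and hence any
    kernel ridge regression built on it, invariant to those transformations.
    """
    variants = []
    for k in range(4):
        rotated = np.rot90(img_y, k)
        variants.append(rotated)
        variants.append(np.fliplr(rotated))
    return np.mean([rbf(img_x.ravel(), v.ravel(), gamma) for v in variants])

# Tiny illustrative kernel ridge regression fit on toy data.
X = [np.random.rand(3, 3) for _ in range(5)]   # five toy "images"
y = np.random.rand(5)                          # toy targets
K = np.array([[invariant_kernel(a, b) for b in X] for a in X])
alpha = np.linalg.solve(K + 0.1 * np.eye(len(X)), y)  # ridge coefficients
```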
“To the best of my knowledge, this is the first time Weyl’s law has been used to determine how machine learning can be enhanced by symmetry,” Tahmasebi stated.
The research was initially presented at the December 2023 Neural Information Processing Systems conference.
This is particularly crucial in fields like computational chemistry and cosmology, where high-quality data are scarce. Sparse data is common in such fields: the datasets are exceptionally large, yet the genuinely useful data within them are very limited.
For instance, in the vastness of space, you might find one tiny speck of useful data amid an unfathomably large sea of nothingness. You have to make that speck of data work, and symmetry is a helpful tool for doing so.
Soledad Villar, an applied mathematician at Johns Hopkins University, noted of the study, “Models that satisfy the symmetries of the problem are not only correct but also can produce predictions with smaller errors, using a small amount of training points.”
Benefits and results
The researchers identified two types of improvements from utilizing symmetries: a linear boost, where the efficiency increases in proportion to the symmetry, and an exponential gain, which offers a disproportionately large benefit when dealing with symmetries that span multiple dimensions.
“This is a new contribution that is basically telling us that symmetries of higher dimension are more important because they can give us an exponential gain,” Tahmasebi elaborated.
Let’s break this down further:
- Using symmetries to enhance data: By recognizing patterns or symmetries in the data (like how an object looks the same even when rotated or flipped), a machine learning model can learn as if it has more data than it actually does. This approach boosts the model’s efficiency, allowing it to learn more from less.
- Simplifying the learning task: Their second finding concerns shrinking the space of functions the model has to search by focusing on these symmetries (see the sketch after this list). Since the model learns to ignore changes that don’t matter (like the position or orientation of an object), it has to deal with less complicated information. This means the model can achieve good results with fewer examples, speeding up the learning process and improving performance.
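Here is a minimal sketch of that second idea, again illustrative rather than the paper’s method: instead of (or in addition to) augmenting the data, each input can be mapped to a single canonical representative of all its rotated and flipped versions, so the function the model must learn is defined on a smaller, simpler space.

```python
import numpy as np

def canonicalize(image: np.ndarray) -> np.ndarray:
    """Map every rotated/flipped version of an image to one representative.

    After this step the model never sees two inputs that differ only by a
    symmetry, so changes that don't matter (orientation, mirroring) are
    removed before learning even begins.
    """
    candidates = []
    for k in range(4):
        rotated = np.rot90(image, k)
        candidates.append(rotated)
        candidates.append(np.fliplr(rotated))
    # Pick the lexicographically smallest flattened version as the representative.
    return min(candidates, key=lambda a: tuple(a.ravel()))

img = np.arange(9).reshape(3, 3)
# The original and its rotation map to the same canonical input.
assert np.array_equal(canonicalize(img), canonicalize(np.rot90(img)))
```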
Haggai Maron, a computer scientist at Technion and NVIDIA, praised the work for its novel perspective, telling MIT, “This theoretical contribution lends mathematical support to the emerging subfield of ‘Geometric Deep Learning.’”
The researchers directly highlight the potential impact in computational chemistry, where the principles from their study could accelerate drug discovery processes, for example.
By exploiting symmetries in molecular structures, machine learning models can predict interactions and properties with fewer data points, making screening potential drug compounds faster and more efficient.
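As a purely illustrative sketch (not the study’s method), one way to exploit those symmetries is to describe a molecule by its sorted interatomic distances: the description does not change when the molecule is rotated, translated, or its atoms listed in a different order, so a model trained on it never wastes data relearning those irrelevant differences.

```python
import numpy as np

def invariant_descriptor(coords: np.ndarray) -> np.ndarray:
    """Describe a molecule by its sorted pairwise interatomic distances.

    The result is unchanged by rotating or translating the molecule or by
    reordering its atoms, so every training molecule carries more usable
    information for a downstream regression model.
    """
    n = coords.shape[0]
    dists = [np.linalg.norm(coords[i] - coords[j])
             for i in range(n) for j in range(i + 1, n)]
    return np.sort(np.array(dists))

# A molecule and a rotated, translated, reordered copy share one descriptor.
mol = np.random.rand(5, 3)                       # five atoms in 3-D space
theta = 0.3
rotation = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                     [np.sin(theta),  np.cos(theta), 0.0],
                     [0.0,            0.0,           1.0]])
copy = (mol @ rotation.T)[np.random.permutation(5)] + 1.0
assert np.allclose(invariant_descriptor(mol), invariant_descriptor(copy))
```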
Symmetries could also assist in analyzing cosmic phenomena, where datasets are extremely large yet sparsely populated by useful data.
Examples could include leveraging symmetries for studying cosmic microwave background radiation or the structure of galaxies to extract more insights from limited data.