An AI system developed using comprehensive personal data from Denmark has shown remarkable accuracy in predicting an individual’s risk of death.
This AI, documented in a study published in Nature Computational Science, was created by Sune Lehmann Jørgensen and his team from the Technical University of Denmark.
They analyzed a dataset covering six million people across the Danish population from 2008 to 2020, including education, medical visits, diagnoses, income, and occupation data.
This data was then transformed into a format suitable for training a large language model (LLM). The team’s Life2vec model reviews a person’s life events and forecasts probable future outcomes, similar to how an LLM processes language.
To test Life2vec, the team reserved the final four years of data and focused on individuals aged 35 to 65, half of whom passed away between 2016 and 2020.
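A minimal sketch of what such a test setup could look like, assuming each person is a dictionary with a birth year and a flag for whether they died in the held-out window. The function names, field names, and the use of random sampling to balance the cohort are illustrative assumptions, not the team's actual procedure.

```python
from datetime import date
import random


def split_by_cutoff(events, cutoff):
    """Split (date, label) events into model input and held-out data."""
    history = [(d, label) for d, label in events if d < cutoff]
    held_out = [(d, label) for d, label in events if d >= cutoff]
    return history, held_out


def balanced_cohort(people, cutoff_year, min_age=35, max_age=65, seed=0):
    """Pick people in the age range, half of whom died in the held-out window.

    The "birth_year" and "died_in_window" keys are hypothetical field names.
    """
    eligible = [
        p for p in people
        if min_age <= cutoff_year - p["birth_year"] <= max_age
    ]
    died = [p for p in eligible if p["died_in_window"]]
    survived = [p for p in eligible if not p["died_in_window"]]
    n = min(len(died), len(survived))  # equal group sizes
    rng = random.Random(seed)          # seeded for reproducibility
    return rng.sample(died, n) + rng.sample(survived, n)
```

Only the events before the cutoff would be shown to the model, so that its predictions can be checked against what happened afterwards.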
Life2vec’s predictions of who would survive outperformed existing AI models and actuarial life tables (used by the insurance industry) by approximately 11%. It was also used to predict personality outcomes, showcasing the model’s ability to map large-scale societal inputs to outputs at the individual level.
Jørgensen envisions this model as a tool for early detection of health and social issues, potentially aiding governments in reducing health and social inequalities. It uncovers relationships between mortality and factors such as economic and labor circumstances, income level, and birth year, providing another avenue for exploring the impact of these macro-demographic factors on an individual’s health.
However, Jørgensen cautions against potential business misuse, particularly in the insurance industry, where it could disrupt the fundamental principle of shared risk.
If insurers used AI to determine when a specific individual is at greater risk of death, that would open up a complex ethical debate. There are parallels with AI’s other predictive uses, such as predictive policing programs, which have singled out individuals as potential ‘suspects’ before they have committed any crime.
Jørgensen said of this, “Clearly, our model should not be used by an insurance company, because the whole idea of insurance is that, by sharing the lack of knowledge of who is going to be the unlucky person struck by some incident, or death, or losing your backpack, we can kind of share this burden.”
More about the study
Here’s some more information about the study’s aims, novel approach, and how it worked:
- Data collection and transformation: The research team gathered an extensive dataset covering the entire population of Denmark, about six million inhabitants. Records from 2008 to 2016 served as the model’s input, with the following four years reserved for testing. The dataset incorporated detailed daily records of life events, including health incidents, educational level, employment status, income levels, residency, and working hours.
- Creating a synthetic language for life events: The researchers converted these life events into a format resembling language, enabling the use of natural language processing techniques. They treated each life event as a ‘sentence’ consisting of ‘concept tokens,’ which included detailed information like the type of event, income level, and job type.
- Development of the Life2vec model: Using transformer architecture, the team developed the model. This model could capture complex relationships between different life events, similar to how LLMs understand relationships between words.
- Predictive analysis and testing: Life2vec was tested for its ability to predict various outcomes, notably early mortality and personality traits. For mortality prediction, the model evaluated the likelihood of individuals surviving four years post-2016. It outperformed traditional models in doing so.
- Understanding and interpreting the model: The researchers used methods such as testing with concept activation vectors (TCAV) to interpret the model’s predictions. This involved identifying directions in the model’s internal representation that correspond to different life outcomes or traits. By analyzing these directions, they gained insights into how factors such as employment status or health diagnoses influenced the model’s predictions.
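To make the "synthetic language" step more concrete, here is a minimal sketch of how life events might be turned into token "sentences" and then into integer ids that a transformer could consume. The `LifeEvent` class, the token naming scheme, and the special `[PAD]` and `[UNK]` tokens are all assumptions made for illustration; the study's actual encoding may differ.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class LifeEvent:
    """One recorded event; all field names here are hypothetical."""
    day: date
    kind: str          # e.g. "health" or "labour"
    attributes: tuple  # e.g. ("income_band_7", "job_teacher")


def event_to_sentence(event):
    """Turn one event into a 'sentence' of concept tokens."""
    return [f"TYPE_{event.kind.upper()}"] + [a.upper() for a in event.attributes]


def build_vocabulary(sentences):
    """Give each distinct token an integer id, in order of first appearance."""
    vocab = {"[PAD]": 0, "[UNK]": 1}
    for sentence in sentences:
        for token in sentence:
            vocab.setdefault(token, len(vocab))
    return vocab


def encode_life(events, vocab):
    """Encode a person's events chronologically as lists of token ids."""
    ordered = sorted(events, key=lambda e: e.day)
    return [
        [vocab.get(t, vocab["[UNK]"]) for t in event_to_sentence(e)]
        for e in ordered
    ]


events = [
    LifeEvent(date(2012, 3, 1), "labour", ("income_band_7", "job_teacher")),
    LifeEvent(date(2010, 6, 15), "health", ("diagnosis_j45",)),
]
sentences = [event_to_sentence(e) for e in sorted(events, key=lambda e: e.day)]
vocab = build_vocabulary(sentences)
print(encode_life(events, vocab))  # [[2, 3], [4, 5, 6]]
```

Once every life is a sequence of token ids, the same kind of transformer model used for text can be trained on it, which is the analogy the researchers draw with LLMs.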
Using AI to predict important life events, death undoubtedly being one of the most significant, is a tantalizing prospect.
While its benefits and risks are closely balanced, similar applications have been channeled to a positive end, such as a model used to predict adolescent suicide and self-harm. In healthcare in general, predictive modeling is helping prioritize treatments for at-risk groups.
However, as Jørgensen concedes, work remains to be done to ensure these technologies are used ethically.