In a recent analysis, ChatGPT has not yet proven capable of passing the chartered financial analyst (CFA) exam.
A team of JPMorgan Chase & Co. researchers conducted an experiment to see whether OpenAI’s ChatGPT and GPT-4 models could pass the first two levels of the CFA exam, a credential that typically takes candidates four years to earn.
“Based on estimated pass rates and average self-reported scores, we concluded that ChatGPT would likely not be able to pass the CFA Level I and Level II under all tested settings,” the researchers detailed in their report.
However, GPT-4 had a better chance, with researchers stating, “GPT-4 would have a decent chance of passing the CFA Level I and II if prompted appropriately.”
The researchers, including JPMorgan AI Research members Sameena Shah and Antony Papadimitriou, also noted that the CFA Institute has been working to integrate AI and big-data analysis into its exams since 2017.
Chris Wiese, the CFA Institute’s managing director of education, acknowledged that while large language models (LLMs) like GPT-4 can answer certain exam questions correctly, earning the CFA charter also requires substantial practical experience, references, ethical standards, and, soon, practical skills modules.
The pass rate for Level I dipped to 37% in August, down from an already low average of 43% in 2018.
The study revealed that both AI models faced more challenges with Level II, regardless of the prompting methods used.
However, they showed proficiency in the derivatives, alternative investments, corporate issuers, equity investments, and ethics sections of Level I. Their performance was less impressive in areas like financial reporting and portfolio management.
For Level II, ChatGPT found difficulty with alternative investments and fixed income, whereas GPT-4 struggled more with portfolio management and economics.
Most of ChatGPT’s mistakes were knowledge-based, while GPT-4’s were predominantly calculation and reasoning errors, with flawed logic sometimes leading it to incorrect conclusions.
This follows a similar recent study that exposed ChatGPT’s limitations in accounting exams. It is a language model, after all.