Imagine going to the doctor with concerning neurological symptoms like sudden dizziness, double vision, and trouble chewing. Your physician takes a detailed history and performs a thorough exam, documenting their findings in your medical record. But what if, before you even undergo a brain scan, an artificial intelligence program could analyze those clinical notes and accurately pinpoint the location of a potential stroke?
That’s exactly what researchers from SUNY Downstate Health Sciences University and Yale University set out to investigate in a new study published in the journal Neurology Clinical Practice. They put GPT-4, a state-of-the-art AI language model, to the test, feeding it nearly 50 published case reports of strokes and asking it to determine the location of each patient’s brain lesion based solely on the clinical information provided.
“Not everyone with stroke has access to brain scans or neurologists, so we wanted to determine whether GPT-4 could accurately locate brain lesions after stroke based on a person’s health history and a neurologic exam,” says study author Dr. Jung-Hyun Lee, of State University of New York (SUNY) Downstate Health Sciences University in Brooklyn and a member of the American Academy of Neurology, in a media release.
Without any specialized medical training, GPT-4 correctly identified the broad region of the nervous system affected by the stroke (such as the cerebral hemispheres, brainstem, or spinal cord) about 85 percent of the time. It had a slightly harder time pinpointing which side was affected, but still got it right in roughly three-quarters of cases.
As a large language model, GPT-4 has been trained on a vast amount of text data spanning many topics. This exposure allows it to understand context and relationships between words in a way that mimics human reasoning. When given a patient’s history and exam findings, GPT-4 can correlate specific deficits, like one-sided weakness or vision problems, with the brain structures that control those functions.
For example, in one case, a 55-year-old man suddenly developed dizziness, double vision, trouble closing his eyes and chewing, and mild incoordination on one side. GPT-4 methodically worked through these symptoms, attributing the vision problems to specific eye movement control centers in the brainstem, the chewing difficulty to the nerves that supply the face, and the clumsiness to disruption of cerebellar pathways. Putting the pieces together, it concluded the man had multiple lesions affecting both sides of the brainstem — exactly what the actual brain MRI showed.
Researchers stress that GPT-4 is not perfect and can still make mistakes, especially when given incomplete information or when a case has confounding factors. In some instances, it ignored key findings or made logical missteps. Lesions in the cerebellum, in particular, seemed to give it more trouble, possibly because the case descriptions involving that region were limited.
Still, the scientists see tremendous potential for AI language tools like GPT-4 to assist with neurological diagnosis in the future, especially in resource-limited settings. Many smaller hospitals lack round-the-clock access to neurologists, and a program that could quickly generate a likely stroke location from an ER physician’s notes could help guide time-sensitive treatment decisions or the need for a specialist consult.
On a broader scale, natural language processing AI could be a game-changer for disorders like Parkinson’s disease. Diagnosing these conditions relies heavily on recognizing clinical patterns over an extended time, and they are frequently misdiagnosed early on, when a patient may be seeing a non-neurologist. An AI that could pick up on subtle clues across multiple visits and suggest when it’s time for a neurology referral could lead to faster diagnosis and treatment.
Researchers caution that there is much more work to be done before a tool like GPT-4 is ready for prime time in neurology clinics. Future medical AI models would need more rigorous testing on full, real-world medical records, not just curated case reports. Serious thought must also be given to issues of patient privacy, safety, and liability. And the tools themselves will need further refinement to improve their knowledge base and reduce illogical errors.
“While not yet ready for use in the clinic, large language models such as generative pre-trained transformers have the potential not only to assist in locating lesions after stroke, they may also reduce health care disparities because they can function across different languages,” explains Dr. Lee. “The potential for use is encouraging, especially due to the great need for improved health care in underserved areas across multiple countries where access to neurologic care is limited.”
Even so, this study offers an exciting glimpse into how AI could become a powerful aid for physicians in the complex world of diagnosing brain and nervous system disorders. In the future, the first “consult” on a challenging neurological case may very well be with an artificial intelligence like GPT-4.