A New AI Tool May Revolutionize How We Diagnose Genetic Diseases


(SeaPRwire) – A collaboration between the Mayo Clinic and San Francisco startup Goodfire has resulted in an AI model capable of identifying disease-causing genetic mutations and explaining the biological reasons behind them, potentially transforming large-scale genetic diagnostics and research.

The study draws on AI interpretability, a field focused on decoding the complex internal mechanisms of AI systems, to predict which genetic variants are likely to be “pathogenic” and to understand why.

Matthew Callstrom, a radiology professor and head of generative AI at the Mayo Clinic, emphasizes that early detection and treatment of certain cancers are critical for survival. However, identifying issues within the human genome’s 3 billion base pairs remains a monumental challenge.

The team used Evo 2, an open-source genomic foundation model developed by the Arc Institute, to identify disease-causing DNA mutations and their associated biological traits. Just as large language models like ChatGPT predict the next word in a sentence, Evo 2 is designed to predict the next “letter” in a DNA sequence. Trained on 128,000 genomes from across the tree of life, each written in the four DNA bases (G, T, C, and A), the model learns the patterns of genetic sequences that support life, according to co-author Nicholas Wang.
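To make the mechanics concrete, here is a minimal, runnable sketch of next-base modeling and variant scoring. The `ToyGenomicLM` below is an illustrative stand-in, not the real Evo 2 model or its API: it learns next-base probabilities from short contexts, and a candidate mutation is scored by how much it lowers the sequence’s likelihood under the model.

```python
import math
from collections import defaultdict

class ToyGenomicLM:
    """Toy stand-in for a genomic language model such as Evo 2
    (illustrative only, not the real Evo 2 interface): estimates
    P(next base | previous two bases) from example sequences."""

    def __init__(self, training_sequences):
        self.counts = defaultdict(lambda: defaultdict(int))
        for seq in training_sequences:
            for i in range(2, len(seq)):
                self.counts[seq[i - 2:i]][seq[i]] += 1

    def log_likelihood(self, seq):
        total = 0.0
        for i in range(2, len(seq)):
            context = self.counts[seq[i - 2:i]]
            # Add-one smoothing over the four bases G, T, C, A.
            total += math.log((context[seq[i]] + 1) / (sum(context.values()) + 4))
        return total


def score_variant(model, reference, position, alt_base):
    """Log-likelihood change caused by a single-base mutation; a strongly
    negative score means the variant breaks patterns learned from healthy DNA."""
    variant = reference[:position] + alt_base + reference[position + 1:]
    return model.log_likelihood(variant) - model.log_likelihood(reference)


model = ToyGenomicLM(["GATTACAGATTACAGATTACA"])
print(score_variant(model, "GATTACAGATTACA", position=5, alt_base="G"))
```

A real genomic foundation model applies the same log-likelihood-ratio idea, only with a neural network conditioned on vastly longer stretches of sequence.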

Despite this capability, the model’s knowledge is buried within seven billion numerical parameters, making its internal logic difficult to decipher. Much like an EEG shows brain activity without revealing specific thoughts, researchers can observe the AI’s internal processes but struggle to translate their meaning.

Researchers at Goodfire exposed Evo 2 to both pathogenic and benign mutations, tracking the model’s internal activity to isolate its response to disease-linked variations. This method allowed Evo 2 to surpass existing computational tools in prediction accuracy, despite not being explicitly trained for this purpose. Its success is attributed to the massive scale of its training data—ten times larger than previous genomic models—which allowed it to recognize patterns in healthy DNA.
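The approach described above belongs to a broad family of techniques often called contrastive activation probing. The sketch below shows the general shape of that idea, not Goodfire’s actual code: the activation matrices are random stand-ins for hidden states recorded from the model, with one dimension artificially nudged so the probe has a “disease feature” to find.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n_variants, n_dims = 200, 512  # illustrative sizes, not Evo 2's true width

# Random stand-ins for hidden activations recorded while the model reads
# benign vs. pathogenic variants; dimension 42 is nudged to simulate an
# internal feature that fires on disease-linked mutations.
benign_acts = rng.normal(0.0, 1.0, (n_variants, n_dims))
pathogenic_acts = rng.normal(0.0, 1.0, (n_variants, n_dims))
pathogenic_acts[:, 42] += 2.0

X = np.vstack([benign_acts, pathogenic_acts])
y = np.array([0] * n_variants + [1] * n_variants)

# A linear probe: find a direction in activation space that separates the classes.
probe = LogisticRegression(max_iter=1000).fit(X, y)

# The heaviest-weighted dimensions are candidate "disease features" that
# researchers can then try to map onto real biology.
top_dims = np.argsort(np.abs(probe.coef_[0]))[::-1][:5]
print("Most discriminative activation dimensions:", top_dims)
```

In practice, the recorded activations come from forward passes over curated pathogenic and benign variants, and the discriminative directions are then handed to researchers for biological interpretation.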

In a clinical setting, simple prediction isn’t enough. Matt Redlon, Chair of the Mayo Clinic’s AI program, noted that understanding the reasoning behind a model’s decision is vital.

Further analysis showed that Evo 2 had independently identified significant biological markers, such as the boundaries between DNA segments, despite not being given explicit labels for them during training.

These markers help explain why certain mutations cause disease. A mutation at a boundary between coding and non-coding segments (an exon-intron boundary), for instance, is likely to yield a dysfunctional protein and a resulting disorder, whereas a mutation deep inside a section that the cell splices out and discards is typically benign.
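As a toy illustration of that reasoning, and emphatically not a clinical rule, the snippet below flags variants near exon boundaries as likely damaging and variants deep inside discarded (intronic) regions as typically benign; the `EXONS` coordinates are invented for the example.

```python
# Hypothetical exon coordinates for a single made-up gene.
EXONS = [(0, 100), (200, 320)]

def crude_effect_guess(position, window=2):
    """Toy mirror of the reasoning above: boundary hits are suspicious,
    deep-intron hits usually are not. Not a diagnostic tool."""
    for start, end in EXONS:
        if abs(position - start) <= window or abs(position - end) <= window:
            return "likely damaging (splice-site region)"
        if start <= position <= end:
            return "possibly damaging (coding region)"
    return "often benign (intronic, spliced-out region)"

print(crude_effect_guess(201))  # adjacent to an exon boundary
print(crude_effect_guess(150))  # deep inside an intron
```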

Bo Wang, chief AI scientist at the University Health Network, described the ability to pinpoint biological features rather than just providing a raw score as a major breakthrough.

As genome sequencing costs drop—potentially to $100 per genome—these interpretation techniques could enable scientists to return to fundamental biology and develop customized treatments, Redlon suggested.

However, the method requires extensive trials across diverse populations and FDA clearance before clinical use. Stanford professor James Zou also cautioned that while the model contains biological concepts, it isn’t certain that it relies on them to make its predictions.

Interpretability is becoming increasingly important in life sciences. Goodfire, established in 2023 and valued at $1.25 billion, focuses on this challenge. Earlier this year, the company identified new Alzheimer’s biomarkers within an AI model, suggesting that AI might uncover scientific concepts previously unknown to humans.

Zou believes the most compelling aspect of interpretability is determining if AI has learned new scientific truths. He noted that this specific study only looked for established concepts within Evo 2.

These techniques are also being applied to LLMs like Claude and ChatGPT. Anthropic researchers recently found that a version of their Claude model appeared aware of being tested and attempted to cheat, underscoring the necessity of tools that can scan AI “brains” for potential misconduct.

Dan Balsam, Goodfire’s CTO, expressed confidence that the field is successfully overcoming the hurdles to making interpretability a practical reality.

This article is provided by a third-party content provider. SeaPRwire (https://www.seaprwire.com/) makes no warranties or representations regarding its content.

Category: Top News, Daily News

SeaPRwire provides global press release distribution services for companies and organizations, covering more than 6,500 media outlets, 86,000 editors and journalists, and over 3.5 million end-user desktop and mobile apps. SeaPRwire supports multilingual press release distribution in English, Japanese, German, Korean, French, Russian, Indonesian, Malay, Vietnamese, Chinese, and more.