Classifying Clinical Evidence Levels of Cancer Variants in Biomedical Literature Using Machine Learning.
Tool / method
LLMs (GPT-4.1-mini, Gemini 2.5 Flash) for automated CIViC evidence level classification
Summary
This study benchmarks two LLMs (GPT-4.1-mini, Gemini 2.5 Flash) and two classical ML algorithms (decision tree, XGBoost) for automatically classifying scientific publications according to CIViC (Clinical Interpretation of Variants in Cancer) evidence levels. The LLMs were tested with zero-shot and few-shot prompting strategies, while the classical ML models were trained on TF-IDF and word embedding representations of the text. The aim is to automate bibliographic curation to accelerate variant interpretation in precision oncology.
Synthesis written by Geno'X. For the full original abstract, please refer to the source publication.
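To make the difference between zero-shot and few-shot prompting concrete, here is a minimal Python sketch. It is not the prompt used in the study: the wording of the instructions, the names build_prompt, parse_level and LabeledAbstract, and the placeholder abstracts are all assumptions made for illustration. Only the five CIViC evidence levels (A to E) come from the CIViC knowledgebase itself. The model call is deliberately left out, so the sketch only builds the prompt text and parses a reply.

```python
import re
from dataclasses import dataclass
from typing import Optional, Sequence

# CIViC evidence levels A-E, as named in the CIViC knowledgebase.
CIVIC_LEVELS = {
    "A": "Validated association",
    "B": "Clinical evidence",
    "C": "Case study",
    "D": "Preclinical evidence",
    "E": "Inferential association",
}


@dataclass
class LabeledAbstract:
    """Hypothetical container for one curated example (text + level)."""
    text: str
    level: str  # must be a key of CIVIC_LEVELS


def build_prompt(abstract: str,
                 examples: Optional[Sequence[LabeledAbstract]] = None) -> str:
    """Build a classification prompt.

    With no examples this is a zero-shot prompt; with labeled examples
    prepended it becomes a few-shot prompt. The wording is an assumption.
    """
    lines = ["Classify the abstract into one CIViC evidence level.", "Levels:"]
    lines += [f"{key}: {name}" for key, name in CIVIC_LEVELS.items()]
    for example in examples or []:
        if example.level not in CIVIC_LEVELS:
            raise ValueError(f"unknown evidence level: {example.level!r}")
        lines += ["", f"Abstract: {example.text}", f"Level: {example.level}"]
    # The query abstract comes last, with the label left for the model.
    lines += ["", f"Abstract: {abstract}", "Level:"]
    return "\n".join(lines)


def parse_level(reply: str) -> Optional[str]:
    """Return the level letter if the reply is just a label, else None.

    Deliberately strict: free-text replies are rejected rather than guessed.
    """
    match = re.fullmatch(r"(?:level\s*:?\s*)?([a-e])\.?", reply.strip(),
                         flags=re.IGNORECASE)
    return match.group(1).upper() if match else None
```

Calling build_prompt(text) gives the zero-shot variant, and build_prompt(text, examples) gives the few-shot variant. The strict parser is a design choice: a reply that cannot be mapped unambiguously to one level is returned as None and can be sent to a human curator instead of being counted as a prediction.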
Analysis
Manual bibliographic curation for variant interpretation is a major bottleneck in molecular oncology, and increasingly in constitutional genetics. This work demonstrates that recent LLMs, used with few-shot prompting, can effectively assist with this task. It is a concrete step toward automating the bibliographic surveillance workflow for variant classification, a challenge directly relevant to databases such as ClinVar or InSiGHT.
Why this score?
Clinical impact: 2/3 · Evidence strength: 2/3 · Novelty: 1/2 · Sample size: 0/1 · Journal quality: 0/1 → Total: 5/10