Phenotypic prediction of missense variants via deep contrastive learning
Tool / method
N/A (5.1 million missense variants covered)
PheMART: deep contrastive learning integrating protein language models (PLM), protein-protein interactions, protein domains, medical knowledge graphs and EHR data → joint projection of variants + phenotypes in metric space → variant-phenotype associations for 4,179 HPO phenotypes
Summary
PheMART is a deep contrastive learning model that predicts the phenotypic consequences of missense variants by associating each variant with a spectrum of HPO phenotypes. It integrates protein language models (PLMs), protein-protein interaction data, protein domains, medical knowledge graphs and electronic health record (EHR) data. By jointly projecting variants and phenotypes into a shared metric space, PheMART outperforms AlphaMissense, CADD and REVEL on phenotype-specific pathogenicity benchmarks covering 4,179 HPO phenotypes. A database of predictions for 5.1 million missense variants is openly available on GitHub (celehs/PheMART).
Synthesis written by Geno'X. For the full original abstract, please refer to the source publication.
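To make the "shared metric space" idea concrete, here is a minimal sketch of how a variant embedding can be scored against phenotype embeddings, and how a contrastive objective rewards the correct pairing. It is an illustration only, not PheMART's architecture or loss: the function names, the toy 3-dimensional vectors, the placeholder phenotype labels and the InfoNCE-style loss with temperature 0.1 are all assumptions chosen for the example.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products equal cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def cosine(u, v):
    """Cosine similarity between two vectors of equal length."""
    u, v = l2_normalize(u), l2_normalize(v)
    return sum(a * b for a, b in zip(u, v))


def rank_phenotypes(variant_vec, phenotype_vecs):
    """Return (phenotype_id, similarity) pairs, most similar first."""
    scored = [(pid, cosine(variant_vec, vec)) for pid, vec in phenotype_vecs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def info_nce_loss(variant_vec, phenotype_vecs, positive_id, temperature=0.1):
    """InfoNCE-style contrastive loss (an assumed, generic objective).

    The loss is small when the positive phenotype is much closer to the
    variant than the other phenotypes, and large otherwise.
    """
    logits = {
        pid: cosine(variant_vec, vec) / temperature
        for pid, vec in phenotype_vecs.items()
    }
    # Log-sum-exp with the maximum subtracted for numerical stability.
    max_logit = max(logits.values())
    log_denominator = max_logit + math.log(
        sum(math.exp(value - max_logit) for value in logits.values())
    )
    return log_denominator - logits[positive_id]


# Hypothetical toy embeddings; labels are placeholders, not real HPO terms.
phenotypes = {
    "phenotype_A": [1.0, 0.0, 0.0],
    "phenotype_B": [0.0, 1.0, 0.0],
    "phenotype_C": [0.0, 0.0, 1.0],
}
variant = [0.9, 0.1, 0.2]

ranking = rank_phenotypes(variant, phenotypes)
loss_correct = info_nce_loss(variant, phenotypes, "phenotype_A")
loss_wrong = info_nce_loss(variant, phenotypes, "phenotype_B")

for phenotype_id, similarity in ranking:
    print(f"{phenotype_id}: {similarity:.3f}")
```

In this toy setup the variant lies closest to phenotype_A, so treating phenotype_A as the true association gives a lower loss than treating phenotype_B as the true one. Training on such an objective pulls associated variant-phenotype pairs together in the embedding space, which is what allows a ranked, phenotype-specific readout rather than a single pathogenic/benign score.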
Analysis
PheMART moves from binary pathogenic/benign prediction to phenotype-specific prediction, a major step for interpreting variants of uncertain significance (VUS) in a specific clinical context. EHR integration is particularly innovative because it ties prediction to the patient's clinical profile. The 5.1M-variant database can be integrated directly into tertiary annotation pipelines.
Why this score?
Nature Biomedical Engineering (top journal) +3; phenotype-specific prediction (beyond binary) +2; 5.1M variants and 4,179 phenotypes +2; innovative EHR integration +1; open-source +1
Keywords