Temporal Positive-unlabeled Learning for Biomedical Hypothesis Generation via Risk Estimation

Uchenna Akujuobi, Jun Chen, Mohamed Elhoseiny, Michael Spranger, Xiangliang Zhang

Understanding the relationships between biomedical terms like viruses, drugs, and symptoms is essential in the fight against diseases. Many attempts have been made to introduce the use of machine learning to the scientific process of hypothesis generation (HG), which refers to the discovery of meaningful implicit connections between biomedical terms. However, most existing methods fail to truly capture the temporal dynamics of scientific term relations and also assume unobserved connections to be irrelevant (i.e., in a positive-negative (PN) learning setting). To break these limits, we formulate this HG problem as a future connectivity prediction task on a dynamic attributed graph via positive-unlabeled (PU) learning. Then, the key is to capture the temporal evolution of node pair (term pair) relations from just the positive and unlabeled data. We propose a variational inference model to estimate the positive prior, and incorporate it in the learning of node pair embeddings, which are then used for link prediction. Experimental results on real-world biomedical term relationship datasets and case study analyses on a COVID-19 dataset validate the effectiveness of the proposed model.