Semantic Enrichment of Pretrained Embedding Output for Unsupervised IR
Giorgos Stamou, Chrysoula Zerva, Alexios Mandalios, Konstantinos Thomas, Giorgos Filandrianos, Edmund Dervakos
Abstract
The rapid growth of scientific literature in the biomedical and clinical domain has significantly complicated the identification of information of interest by researchers as well as other practitioners. More importantly, the rapid emergence of new topics and findings often hinders the performance of supervised approaches, due to the lack of relevant annotated data. The global COVID-19 pandemic further highlighted the need to query and navigate uncharted ground in the scientific literature in a prompt and efficient way.

In this paper we investigate the potential of semantically enhancing deep transformer architectures using SNOMED-CT in order to answer user queries in an unsupervised manner. Our proposed system attempts to filter and re-rank documents related to a query that were initially retrieved using BERT models. To achieve that, we enhance queries and documents with SNOMED-CT concepts and then impose filters on concept co-occurrence between them. We evaluate this approach on the OHSUMED dataset and show competitive performance, and we also present our approach for adapting it to full papers, such as Kaggle's CORD-19 full-text dataset challenge.
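The filter-and-re-rank step described above can be illustrated with a minimal sketch. It assumes that concept annotation and the initial BERT-based retrieval have already been run, and it is not the paper's actual method: the `Candidate` type, the `filter_and_rerank` function, the `min_shared` threshold, the ordering rule, and the concept identifiers (placeholders, not real SNOMED-CT codes) are all hypothetical choices made for illustration.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    """A document returned by the initial embedding-based retrieval (hypothetical type)."""
    doc_id: str
    score: float          # similarity score from the initial retrieval
    concepts: frozenset   # concept identifiers annotated on the document


def filter_and_rerank(query_concepts, candidates, min_shared=1):
    """Keep candidates sharing at least `min_shared` concepts with the query,
    then order them by shared-concept count and, on ties, by initial score.

    Assumption: this co-occurrence rule is illustrative only; the abstract
    does not specify the exact filters used.
    """
    query_concepts = frozenset(query_concepts)
    kept = []
    for cand in candidates:
        shared = len(query_concepts & cand.concepts)
        if shared >= min_shared:
            kept.append((shared, cand.score, cand.doc_id))
    # Sort by descending overlap, then by descending initial score.
    kept.sort(key=lambda t: (-t[0], -t[1]))
    return [doc_id for _, _, doc_id in kept]


# Placeholder concept identifiers, not real SNOMED-CT codes.
candidates = [
    Candidate("d1", 0.91, frozenset({"C_COUGH"})),
    Candidate("d2", 0.85, frozenset({"C_FEVER", "C_PNEUMONIA"})),
    Candidate("d3", 0.80, frozenset({"C_FRACTURE"})),
]
ranked = filter_and_rerank({"C_FEVER", "C_PNEUMONIA", "C_COUGH"}, candidates)
print(ranked)
```

In this toy run, `d3` shares no concept with the query and is filtered out, while `d2` moves ahead of `d1` because it shares more concepts despite its lower initial score. Whether overlap should dominate the initial score, or merely act as a filter, is a design choice that this sketch does not settle.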