Automatic Indexing of Journal Abstracts with Latent Semantic Analysis
Experimental IR Meets Multilinguality, Multimodality, and Interaction, Jan 2015
Abstract
The BioASQ "Task on Large-Scale Online Biomedical Semantic Indexing" charges participants with assigning semantic tags to biomedical journal abstracts. We present a system that takes a biomedical abstract as input and uses latent semantic analysis to identify similar documents in the MEDLINE database. The system then uses a novel ranking scheme to select a list of MeSH tags from candidates drawn from the most similar documents. Our approach achieved better-than-baseline performance in both precision and recall. We suggest several possible strategies to improve the system's performance.
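
The pipeline described in the abstract (project an abstract into a latent semantic space, find its nearest neighbours, then rank the MeSH tags those neighbours carry) can be illustrated with a small sketch. This is not the system's implementation: the toy corpus, the function names (`build_lsa`, `fold_in`, `suggest_tags`), the raw term counts, and the parameter values are all assumptions made for illustration. In particular, the abstract does not specify the ranking scheme, so similarity-weighted voting is used here as a simple stand-in.

```python
from collections import defaultdict

import numpy as np

# Invented toy data: (abstract text, MeSH tags). Not drawn from MEDLINE.
CORPUS = [
    ("insulin glucose diabetes patients", ["Diabetes Mellitus", "Insulin"]),
    ("glucose metabolism insulin resistance", ["Insulin Resistance", "Insulin"]),
    ("tumor cells cancer therapy", ["Neoplasms", "Antineoplastic Agents"]),
    ("cancer tumor growth mice", ["Neoplasms", "Mice"]),
]


def build_lsa(texts, k):
    """Build a term-document count matrix and keep its top-k singular triplets."""
    vocab = sorted({word for text in texts for word in text.split()})
    index = {word: i for i, word in enumerate(vocab)}
    counts = np.zeros((len(vocab), len(texts)))
    for j, text in enumerate(texts):
        for word in text.split():
            counts[index[word], j] += 1.0
    U, s, Vt = np.linalg.svd(counts, full_matrices=False)
    U_k = U[:, :k]
    # Each row is one document's coordinates in the latent space (U_k^T a_j).
    doc_vecs = Vt[:k, :].T * s[:k]
    return index, U_k, doc_vecs


def fold_in(text, index, U_k):
    """Project an unseen abstract into the same latent space (U_k^T q)."""
    q = np.zeros(U_k.shape[0])
    for word in text.split():
        if word in index:  # out-of-vocabulary words are ignored
            q[index[word]] += 1.0
    return q @ U_k


def suggest_tags(query, corpus, k=2, n_neighbors=2, n_tags=3):
    """Rank candidate tags from the nearest documents by summed cosine similarity.

    The voting rule is an assumed stand-in, not the ranking scheme of the paper.
    """
    texts = [text for text, _ in corpus]
    index, U_k, doc_vecs = build_lsa(texts, k)
    q = fold_in(query, index, U_k)
    norms = np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(q) + 1e-12
    sims = doc_vecs @ q / norms
    nearest = np.argsort(-sims)[:n_neighbors]
    scores = defaultdict(float)
    for j in nearest:
        for tag in corpus[j][1]:
            scores[tag] += float(sims[j])
    ranked = sorted(scores, key=lambda tag: (-scores[tag], tag))
    return ranked[:n_tags]


if __name__ == "__main__":
    print(suggest_tags("insulin glucose patients", CORPUS))
```

In this toy setting a tag shared by several close neighbours outranks a tag carried by only one of them. A realistic system would likely use weighted term frequencies (for example TF-IDF), a much larger number of latent dimensions, and a sparse truncated SVD rather than a dense one.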