
Large-Scale Literature-Based Entity Recognition for COVID-19

Researchers Involved

Dr. Fabio Rinaldi

Research Areas

COVID-19
Entity recognition
Medical literature
Scientific literature

Timeframe

2020 - 2020

Background

  • Over the years, we have developed efficient and reliable methods for entity recognition across the scientific literature.
  • We recently applied these methods to publications about COVID-19. They can be applied to any text that discusses medical or biological aspects of a given disease.

Methods

  • Our dictionary-based lookup tool OGER is used in conjunction with a pretrained BioBERT model and a vocabulary specific to COVID-19.
  • Named entity recognition (NER) and named entity normalization (NEN) used to be performed sequentially, but performing both simultaneously yields better results.
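
The idea of producing a text span (NER) and a concept identifier (NEN) in one step can be illustrated with a minimal dictionary-lookup sketch. This is not OGER's API and it does not involve BioBERT; the vocabulary, the placeholder concept identifiers, and the names `Annotation`, `build_pattern`, and `tag` are all assumptions made for illustration only.

```python
import re
from typing import Dict, List, NamedTuple


class Annotation(NamedTuple):
    start: int        # character offset where the mention begins
    end: int          # character offset just past the mention
    text: str         # surface form as it appears in the input
    concept_id: str   # identifier the mention is normalized to


# Hypothetical toy vocabulary: lowercase surface form -> placeholder concept ID.
# A real COVID-19 vocabulary would map to identifiers from curated terminologies.
VOCAB: Dict[str, str] = {
    "covid-19": "CONCEPT:0001",
    "sars-cov-2": "CONCEPT:0002",
    "coronavirus": "CONCEPT:0003",
    "fever": "CONCEPT:0004",
}


def build_pattern(vocab: Dict[str, str]) -> "re.Pattern[str]":
    # Sort longest-first so longer terms take precedence over their prefixes.
    terms = sorted(vocab, key=len, reverse=True)
    alternation = "|".join(re.escape(term) for term in terms)
    # Lookarounds keep a term from matching inside a longer word.
    return re.compile(r"(?<!\w)(" + alternation + r")(?!\w)", re.IGNORECASE)


def tag(text: str, vocab: Dict[str, str] = VOCAB) -> List[Annotation]:
    # Each match yields the span and the normalized identifier together,
    # so recognition and normalization happen in the same pass.
    pattern = build_pattern(vocab)
    return [
        Annotation(m.start(), m.end(), m.group(0), vocab[m.group(0).lower()])
        for m in pattern.finditer(text)
    ]


if __name__ == "__main__":
    for annotation in tag("Fever is common in COVID-19 caused by SARS-CoV-2."):
        print(annotation)
```

A pure lookup like this cannot resolve ambiguous or unseen mentions, which is presumably where a pretrained language model contributes in the actual pipeline.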

Project Goal

  • Roughly 3,000 abstracts related to COVID-19 appear on PubMed every week. These publications need efficient, reliable tagging to support health researchers.

Results

  • Our pipeline tags PMC and PubMed articles with COVID-19-related and other medical concepts, and outputs the annotations in a variety of formats (EuroPMC, PubAnnotation, BioC…).
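
As an illustration of one such output format, the sketch below serializes annotations into a JSON document modeled on the PubAnnotation layout (a `text` field plus a `denotations` list with `id`, `span`, and `obj`). It is a simplified assumption about what the export might look like, not the pipeline's actual writer; the function name `to_pubannotation` and the concept identifiers are hypothetical.

```python
import json
from typing import Dict, List, Tuple

# Each annotation is (begin offset, end offset, concept identifier).
Span = Tuple[int, int, str]


def to_pubannotation(text: str, annotations: List[Span]) -> Dict:
    # Number the denotations T1, T2, ... in the order they are given.
    denotations = [
        {"id": f"T{i}", "span": {"begin": begin, "end": end}, "obj": concept_id}
        for i, (begin, end, concept_id) in enumerate(annotations, start=1)
    ]
    return {"text": text, "denotations": denotations}


if __name__ == "__main__":
    text = "Fever in COVID-19."
    # Placeholder identifiers, used for illustration only.
    spans = [(0, 5, "CONCEPT:0004"), (9, 17, "CONCEPT:0001")]
    print(json.dumps(to_pubannotation(text, spans), indent=2))
```

Keeping character offsets alongside the concept identifier lets a consumer recover the exact mention from `text` without re-running the tagger.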