Large-Scale Literature-Based Entity Recognition for COVID-19

Researchers Involved

Dr. Fabio Rinaldi

Research Areas

Entity recognition
Medical literature
Scientific literature


2020 - 2020


  • Over the years, we have developed efficient and reliable methods for entity recognition across the scientific literature.
  • Recently, we applied them to publications about COVID-19. The same methods can be applied to any text that discusses medical or biological aspects of a given disease.


  • Our dictionary-based lookup tool, OGER, is used in conjunction with a pretrained BioBERT model and a COVID-19-specific vocabulary.
  • Named entity recognition (NER) and named entity normalization (NEN) used to be performed sequentially, but performing both simultaneously yields better results.
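
As a rough illustration of the dictionary-lookup idea, the sketch below matches vocabulary terms in a text and emits each hit together with its concept identifier, so that the span (recognition) and the identifier (normalization) come out of the same step. It is a minimal sketch only: it does not use OGER's actual interface, it leaves out the BioBERT component entirely, and the vocabulary, the `CONCEPT:` identifiers, and the function names `build_pattern` and `tag` are all hypothetical.

```python
import re

# Hypothetical mini-vocabulary: surface form -> concept identifier.
# The identifiers are placeholders, not taken from any real terminology.
VOCABULARY = {
    "covid-19": "CONCEPT:0001",
    "sars-cov-2": "CONCEPT:0002",
    "ace2": "CONCEPT:0003",
    "cytokine storm": "CONCEPT:0004",
}


def build_pattern(vocabulary):
    """Compile one case-insensitive pattern covering every vocabulary term."""
    # Longest terms first, so multi-word entries win over shorter overlaps.
    terms = sorted(vocabulary, key=len, reverse=True)
    alternation = "|".join(re.escape(term) for term in terms)
    # The lookarounds stop matches that start or end inside a longer token.
    return re.compile(r"(?<!\w)(?:" + alternation + r")(?!\w)", re.IGNORECASE)


def tag(text, vocabulary=VOCABULARY):
    """Return one annotation per match: character span, text, concept id."""
    pattern = build_pattern(vocabulary)
    return [
        {
            "begin": match.start(),
            "end": match.end(),
            "text": match.group(0),
            "concept": vocabulary[match.group(0).lower()],
        }
        for match in pattern.finditer(text)
    ]


if __name__ == "__main__":
    sentence = "SARS-CoV-2 binds ACE2 and may trigger a cytokine storm."
    for annotation in tag(sentence):
        print(annotation)
```

A pure lookup like this cannot resolve ambiguous terms or find spelling variants missing from the vocabulary, which is one reason to pair it with a trained model.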

Project Goal

  • PubMed receives roughly 3000 COVID-19-related abstracts per week, so we need efficient, reliable tagging of these publications to support health researchers.


  • Our pipeline tags PMC and PubMed articles for COVID-19-related and other medical concepts, and outputs the annotations in a variety of formats (EuroPMC, PubAnnotation, BioC…).
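
To give a sense of what such an output step involves, the sketch below serialises a list of span annotations into a PubAnnotation-style JSON document. It is an assumption-laden illustration, not the pipeline's own exporter: the field names (`text`, `denotations`, `span`, `obj`, `sourcedb`, `sourceid`) follow our understanding of the PubAnnotation JSON layout and should be checked against the official format description, and the function name `to_pubannotation`, the placeholder source id, and the concept identifier are hypothetical.

```python
import json


def to_pubannotation(text, annotations, sourcedb="PubMed", sourceid="0000000"):
    """Build a PubAnnotation-style dictionary from span annotations.

    Each annotation is expected to carry 'begin', 'end' and 'concept' keys.
    The default sourceid is a placeholder, not a real PubMed identifier.
    """
    denotations = [
        {
            "id": f"T{index}",
            "span": {"begin": ann["begin"], "end": ann["end"]},
            "obj": ann["concept"],
        }
        for index, ann in enumerate(annotations, start=1)
    ]
    return {
        "sourcedb": sourcedb,
        "sourceid": sourceid,
        "text": text,
        "denotations": denotations,
    }


if __name__ == "__main__":
    text = "ACE2 receptor"
    # Hypothetical annotation with a placeholder concept identifier.
    annotations = [{"begin": 0, "end": 4, "concept": "CONCEPT:0003"}]
    print(json.dumps(to_pubannotation(text, annotations), indent=2))
```

Keeping annotations as plain character offsets plus an identifier makes it straightforward to add further exporters for the other formats without touching the tagging step.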