Large-Scale Literature-Based Entity Recognition for COVID-19
Researchers Involved
research areas
COVID-19
Entity recognition
Medical literature
Scientific literature
timeframe
2020 - 2020
contact
fabio.rinaldi@uzh.ch
Background
- Over the years, we have developed efficient and reliable methods for entity recognition in the scientific literature.
- We recently applied them to publications about COVID-19. The same methods can be applied to any text that discusses medical or biological aspects of a given disease.
Methods
- Our dictionary-based lookup tool OGER is used in conjunction with a pretrained BioBERT model and a vocabulary specific to COVID-19.
- Named entity recognition (NER) and named entity normalization (NEN) used to be performed sequentially, but performing both simultaneously yields better results.
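To illustrate the dictionary-lookup side of this approach, here is a minimal sketch in Python. It is not OGER's actual API and it does not involve the BioBERT model. The vocabulary, the `EX:` concept identifiers, and the names `Annotation` and `tag` are hypothetical, chosen only for illustration. The sketch shows how a single lookup pass can return both the matched span (recognition) and a concept identifier (normalization).

```python
import re
from typing import NamedTuple


class Annotation(NamedTuple):
    start: int        # character offset where the match begins
    end: int          # character offset just past the match
    text: str         # surface form as it appears in the input
    concept_id: str   # identifier the surface form is normalized to


# Hypothetical toy vocabulary: lowercased surface form -> placeholder concept ID.
TOY_VOCAB = {
    "covid-19": "EX:0001",
    "sars-cov-2": "EX:0002",
    "fever": "EX:0003",
}


def tag(text, vocab=TOY_VOCAB):
    """Return annotations for every vocabulary term found in text."""
    # Try longer terms first so that a longer term wins over a shorter prefix.
    terms = sorted(vocab, key=len, reverse=True)
    alternation = "|".join(re.escape(term) for term in terms)
    # Lookarounds keep a term from matching inside a longer word.
    pattern = re.compile(r"(?<!\w)(?:" + alternation + r")(?!\w)", re.IGNORECASE)
    return [
        Annotation(m.start(), m.end(), m.group(0), vocab[m.group(0).lower()])
        for m in pattern.finditer(text)
    ]


if __name__ == "__main__":
    for annotation in tag("Fever is common in COVID-19 caused by SARS-CoV-2."):
        print(annotation)
```

A production system would add term variants, abbreviation handling, and disambiguation, none of which this sketch attempts.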
Project Goal
- Roughly 3000 abstracts related to COVID-19 appear on PubMed each week. These publications need efficient, reliable tagging to help health researchers.
Results
- Our pipeline tags PMC and PubMed articles for COVID-19-related and other medical concepts, and outputs the annotations in a variety of formats (EuroPMC, PubAnnotation, BioC…)
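As an illustration of one such output format, the following sketch serializes annotations into a simplified subset of the PubAnnotation JSON layout (a `text` field plus a list of `denotations`, each with an `id`, a `span`, and an `obj`). The function name `to_pubannotation`, the input tuple layout, and the `EX:` identifiers are assumptions made for this example; the pipeline's actual output may carry additional fields.

```python
import json


def to_pubannotation(text, spans):
    """Build a PubAnnotation-style dict.

    spans: iterable of (begin, end, concept_id) tuples with character offsets.
    """
    return {
        "text": text,
        "denotations": [
            {"id": f"T{i}", "span": {"begin": begin, "end": end}, "obj": concept_id}
            for i, (begin, end, concept_id) in enumerate(spans, start=1)
        ],
    }


if __name__ == "__main__":
    # Hypothetical annotations with placeholder concept IDs.
    doc = to_pubannotation(
        "COVID-19 causes fever.",
        [(0, 8, "EX:0001"), (16, 21, "EX:0003")],
    )
    print(json.dumps(doc, indent=2))
```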