A Hybrid System for Chemical Named Entity Simplification
One particular challenge in biomedical named entity recognition (NER) and normalization is the identification and resolution of composite named entities, where a single span refers to more than one concept (e.g., the composite mention BRCA1/2). Previous NER and normalization studies have either neglected composite mentions, used simple rules, or handled only coordination ellipsis, so a robust approach for handling composite mentions is much needed. To this end, we propose a hybrid technique that integrates a machine-learning model with a pattern identification strategy to identify the individual components of each composite mention. Our method, which we have named SimConcept, is the first to systematically handle many types of composite mentions. The technique achieves high performance in identifying and resolving composite mentions for three key biological entities: genes (90.42% in F-measure), diseases (86.47% in F-measure), and chemicals (86.05% in F-measure). The proposed SimConcept technique can subsequently improve the performance of gene, disease, and chemical concept recognition and normalization. We observe that in our datasets approximately 10% of gene, disease, and chemical mentions are composite mentions; hence, it is important to handle them properly. This study presents a new method for simplifying bio-concept mentions in a systematic fashion.
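To make the idea of resolving a composite mention concrete, the following is a minimal sketch of the pattern side only, for mentions shaped like BRCA1/2. The function name `expand_slash_suffix` and the regular expression are illustrative assumptions and are not taken from SimConcept; the machine-learning component of the hybrid technique is not shown.

```python
import re
from typing import List

# Hypothetical pattern (an assumption for illustration, not SimConcept's own):
# an alphabetic stem, a numeric suffix, then one or more "/<number>" alternatives.
_SLASH_SUFFIX = re.compile(
    r"^(?P<stem>[A-Za-z][A-Za-z-]*?)(?P<first>\d+)(?P<rest>(?:/\d+)+)$"
)


def expand_slash_suffix(mention: str) -> List[str]:
    """Split a mention such as 'BRCA1/2' into its individual components.

    Mentions that do not fit the pattern are returned unchanged as a
    one-element list.
    """
    match = _SLASH_SUFFIX.match(mention.strip())
    if match is None:
        return [mention]
    stem = match.group("stem")
    # "/2/3" -> ["2", "3"], prepended with the first numeric suffix.
    suffixes = [match.group("first")] + match.group("rest").strip("/").split("/")
    return [stem + suffix for suffix in suffixes]


if __name__ == "__main__":
    for text in ("BRCA1/2", "SMAD2/3/4", "TP53"):
        print(text, "->", expand_slash_suffix(text))
```

A single hand-written pattern like this covers only one kind of composite mention, which is consistent with the abstract's point that simple rules alone are not sufficient.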
Published by: Gunjal Sonali Vishram, Prof. N. B. Kadu
Author: Gunjal Sonali Vishram
Paper ID: V3I2-1504
Paper Status: published
Published: April 14, 2017