Distributed Representation of Entity Mentions Within and Across Multiple Text Documents

Authors

  • Aliakbar Keshtkaran
  • Siti Sophiayati Yuhaniz
  • Mohammad Reza Rostami

Keywords:

Coreference Resolution, Cross-Document Coreference Resolution, Distributed Representation of Words, Information Extraction, Natural Language Processing

Abstract

Because entities are a basic unit of information for many NLP applications, Cross-Document Coreference Entity Resolution (CDCR) provides techniques for identifying textual mentions of entities and clustering co-referent mentions across multiple documents. Prior work employs Knowledge Bases (KBs) as a structured information resource to enrich the context of mentions, but these methods are limited when an entity is unknown to the KB, which affects the accuracy and performance of the task. Accordingly, this paper presents a new approach that improves on the state of the art by concentrating on the knowledge provided by the input text of the mentions, without relying on any external knowledge resource. For this purpose, we first construct the context of each mention from the sequence of informative words around it (known as content words). Furthermore, by abstracting the mention vector representation to a limited size using a neural network technique for the continuous representation of words (i.e., Word2Vec), we reduce the computational cost of the sub-task of co-referring mentions. Experiments on two datasets show significant gains in both CDCR accuracy and run-time efficiency compared with the best prior methods.
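
The pipeline outlined in the abstract (content-word context, fixed-size mention vector, clustering of co-referent mentions) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the stop-word list, the window size, mean pooling of word vectors, the cosine threshold, the greedy clustering rule, and the toy 2-dimensional embeddings that stand in for trained Word2Vec vectors.

```python
import math

# Hypothetical stop-word list; the paper's content-word filter is not given here.
STOP_WORDS = {"the", "a", "an", "of", "in", "on", "and", "to", "was", "is", "by"}

def content_window(tokens, mention_index, size=3):
    """Return up to `size` content words on each side of the mention."""
    left = [t for t in tokens[:mention_index] if t not in STOP_WORDS][-size:]
    right = [t for t in tokens[mention_index + 1:] if t not in STOP_WORDS][:size]
    return left + right

def mention_vector(context, embeddings, dim):
    """Average the embeddings of known context words (assumed pooling choice)."""
    vecs = [embeddings[w] for w in context if w in embeddings]
    if not vecs:
        return [0.0] * dim
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]

def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def cluster(vectors, threshold=0.8):
    """Greedy single pass: join the first cluster whose first member is similar enough."""
    clusters = []  # each cluster is a list of mention indices
    for i, vec in enumerate(vectors):
        for members in clusters:
            if cosine(vec, vectors[members[0]]) >= threshold:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters

# Toy 2-dimensional embeddings standing in for trained Word2Vec vectors.
EMBEDDINGS = {
    "president": [1.0, 0.0], "elected": [0.9, 0.1], "senator": [0.8, 0.2],
    "guitar": [0.0, 1.0], "album": [0.1, 0.9], "band": [0.2, 0.8],
}

# Each entry is (tokens of one document, index of the mention "obama").
docs = [
    ("the senator obama was elected president".split(), 2),
    ("president obama elected by the senator".split(), 1),
    ("the band guitar obama album".split(), 3),
]
vectors = [mention_vector(content_window(toks, idx), EMBEDDINGS, 2) for toks, idx in docs]
clusters = cluster(vectors)
print(clusters)
```

Because every mention is reduced to a vector of the same small dimension, each pairwise comparison has a fixed cost regardless of how long the surrounding documents are, which is the kind of saving the abstract attributes to the limited-size representation.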

 

 

Published

2019-12-25

How to Cite

Keshtkaran, A., Yuhaniz, S. S., & Rostami, M. R. (2019). Distributed Representation of Entity Mentions Within and Across Multiple Text Documents. Open International Journal of Informatics, 7(Special Issue 1), 35–46. Retrieved from https://oiji.utm.my/index.php/oiji/article/view/114