An Evaluation Methodology of Named Entities Recognition in Spanish Language: ECU 911 Case Study

Marcos Orellana, Andrea Trujillo, Juan-Fernando Lima, María-Inés Acosta, Mario Peña

Abstract


The quality and availability of the information gathered by Integrated Security Services such as ECU 911 in Ecuador are essential for decision-making. It is therefore a priority to avoid losing relevant information such as event addresses, place references, and names. In this context, Named Entity Recognition (NER) is used to discover information in informal texts. Unlike structured corpora labeled for NER analysis, such as CoNLL-2002 or AnCora, informal texts generated from emergency call dialogues have very wide linguistic variety; in addition, there is a strong tendency to lose important information when they are processed. A relevant aspect to consider is the identification of text that denotes entities such as the physical address where an emergency event occurred. This study aims to extract the locations at which emergency events were reported. A set of experiments was performed with NER models based on Convolutional Neural Networks (CNN). Model performance was evaluated against parameters such as training dataset size, dropout rate, use of a location dictionary, and replacement of words denoting locations. The proposed experimentation methodology follows four steps: i) data preprocessing, ii) dataset labeling, iii) model structuring, and iv) model evaluation. Results revealed that model performance improves with more training data, an adequate dropout rate to control overfitting, and the combination of a dictionary of locations with the replacement of words denoting entities.
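As an illustration of the model evaluation step, the following minimal sketch scores predicted location spans against labeled ones. It assumes exact-match precision, recall, and F1, which are common for NER but are not stated in the abstract; the function name `span_prf`, the `LOC` label, and the example transcript are hypothetical.

```python
from typing import Set, Tuple

# A span is (start_token, end_token_exclusive, label). This layout is an
# assumption made for the sketch, not the format used in the study.
Span = Tuple[int, int, str]


def span_prf(gold: Set[Span], pred: Set[Span]) -> Tuple[float, float, float]:
    """Return exact-match precision, recall, and F1 over entity spans."""
    true_positives = len(gold & pred)
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical transcript, tokenized on whitespace:
# "accidente en la avenida Loja y calle Remigio Crespo"
gold = {(3, 5, "LOC"), (6, 9, "LOC")}  # "avenida Loja", "calle Remigio Crespo"
pred = {(3, 5, "LOC"), (7, 9, "LOC")}  # second span misses the word "calle"

precision, recall, f1 = span_prf(gold, pred)
print(precision, recall, f1)
```

Under exact matching, a prediction that drops a location-denoting word such as "calle" counts as an error, so a score of this kind is sensitive to how such words are handled during preprocessing.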

Keywords


named entity recognition; Spanish language; emergency calls; informal text.



DOI: http://dx.doi.org/10.18517/ijaseit.10.3.10939


Published by INSIGHT - Indonesian Society for Knowledge and Human Development