An Evaluation Methodology of Named Entities Recognition in Spanish Language: ECU 911 Case Study

Marcos Orellana; Andrea Trujillo; Juan-Fernando Lima; María-Inés Acosta; Mario Peña

doi:10.18517/ijaseit.10.3.10939

Abstract

Integrated security services such as ECU 911 in Ecuador depend on the quality and availability of the information they gather to support decision-making. It is therefore a priority to avoid losing relevant information such as event addresses, place references, and names. In this context, Named Entity Recognition (NER) is used to discover information in informal texts. Unlike structured corpora labeled for NER analysis, such as CoNLL-2002 or AnCora, informal texts generated from emergency call dialogues have very wide linguistic variety, and important information tends to be lost when they are processed. A relevant aspect to consider is the identification of text that denotes entities, such as the physical address where an emergency event occurred. This study aims to extract the locations at which emergency events were reported. A set of experiments was performed with NER models based on Convolutional Neural Networks (CNNs). Model performance was evaluated against parameters such as training dataset size, dropout rate, the use of a location dictionary, and words denoting locations. The proposed experimentation methodology follows four steps: i) data preprocessing, ii) dataset labeling, iii) model structuring, and iv) model evaluation. Results revealed that model performance improves with more training data, an adequate dropout rate to control overfitting, and a combination of a location dictionary and the replacement of words denoting entities.
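The abstract does not give implementation details, so the following is only a minimal sketch of two of the ideas it names: labeling location mentions with a location dictionary, and scoring predicted entities against labeled ones. It assumes a plain-Python dictionary lookup in place of the CNN-based models the study used, exact-match span scoring, and a hypothetical `LOCATION_DICTIONARY` whose entries are illustrative and not taken from the ECU 911 data.

```python
import re

# Hypothetical location dictionary; entries are illustrative only.
LOCATION_DICTIONARY = ["avenida loja", "parque calderon", "cuenca"]


def tokenize(text):
    """Lowercase and split into word tokens (a stand-in for real preprocessing)."""
    return re.findall(r"\w+", text.lower())


def find_locations(tokens, dictionary):
    """Return the set of (start, end) token spans that match dictionary entries.

    Longer entries are tried first so that multi-word names win over
    shorter names contained in them.
    """
    entries = sorted(
        (t for t in map(tokenize, dictionary) if t), key=len, reverse=True
    )
    spans = set()
    i = 0
    while i < len(tokens):
        for entry in entries:
            if tokens[i:i + len(entry)] == entry:
                spans.add((i, i + len(entry)))
                i += len(entry) - 1  # skip past the matched span
                break
        i += 1
    return spans


def precision_recall_f1(predicted, gold):
    """Entity-level scores: a predicted span counts only on an exact match."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1
```

In this sketch, `find_locations(tokenize(text), LOCATION_DICTIONARY)` yields predicted spans that can be compared with hand-labeled spans through `precision_recall_f1`. A trained model would replace the lookup step, while the scoring step would stay the same.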

Keywords

named entity recognition; Spanish language; emergency calls; informal text.

Published by INSIGHT - Indonesian Society for Knowledge and Human Development

International Journal on Advanced Science, Engineering and Information Technology
