How to Adapt Deep Learning Models to a New Domain: The Case of Biomedical Relation Extraction

dc.creatorPeña-Torres, Jefferson A.
dc.creatorGutiérrez, Raúl E.
dc.creatorBucheli, Víctor A.
dc.creatorGonzález, Fabio A.
dc.date2019-12-05
dc.identifierhttps://revistas.itm.edu.co/index.php/tecnologicas/article/view/1483
dc.identifier10.22430/22565337.1483
dc.descriptionIn this article, we study the relation extraction problem from Natural Language Processing (NLP) in a domain adaptation setting without external resources. We trained a Deep Learning (DL) model for Relation Extraction (RE) that extracts semantic relations in the biomedical domain. However, can the model be applied to other domains? Ideally, the model should adapt automatically to extract relations across different domains using the same DL network. Fully retraining DL models on a short time scale is impractical; instead, the models should adapt quickly to new datasets from several domains without delay. Adaptation is therefore crucial for intelligent systems operating in the real world, where changing factors and unanticipated perturbations are common. In this study, we present a detailed analysis of the problem, as well as preliminary experiments, results, and their evaluation.en-US
dc.descriptionEn este trabajo estudiamos el problema de extracción de relaciones del Procesamiento de Lenguaje Natural (PLN). Realizamos una configuración para la adaptación de dominio sin recursos externos. De esta forma, entrenamos un modelo con aprendizaje profundo (DL) para la extracción de relaciones (RE). El modelo permite extraer relaciones semánticas en el dominio biomédico. Sin embargo, ¿puede el modelo aplicarse a diferentes dominios? El modelo debería adaptarse automáticamente para extraer relaciones entre diferentes dominios usando la red de DL. Entrenar completamente modelos DL en una escala de tiempo corta no es práctico; deseamos que los modelos se adapten rápidamente a diferentes conjuntos de datos de varios dominios y sin demora. Así, la adaptación es crucial para los sistemas inteligentes que operan en el mundo real, donde los factores cambiantes y las perturbaciones imprevistas son habituales. En este artículo presentamos un análisis detallado del problema, una experimentación preliminar, los resultados y su discusión.es-ES
dc.formatapplication/pdf
dc.formattext/xml
dc.formattext/html
dc.languageeng
dc.publisherInstituto Tecnológico Metropolitano (ITM)en-US
dc.relationhttps://revistas.itm.edu.co/index.php/tecnologicas/article/view/1483/1472
dc.relationhttps://revistas.itm.edu.co/index.php/tecnologicas/article/view/1483/1562
dc.relationhttps://revistas.itm.edu.co/index.php/tecnologicas/article/view/1483/1567
dc.rightsCopyright (c) 2019 TecnoLógicasen-US
dc.rightshttp://creativecommons.org/licenses/by-nc-sa/4.0en-US
dc.sourceTecnoLógicas; Vol. 22 (2019): Special issue-2019; 49-62en-US
dc.sourceTecnoLógicas; Vol. 22 (2019): Edición especial-2019; 49-62es-ES
dc.source2256-5337
dc.source0123-7799
dc.subjectSemantic Extractionen-US
dc.subjectDeep Learningen-US
dc.subjectRelation Extractionen-US
dc.subjectNatural Language Processingen-US
dc.subjectExtracción semánticaes-ES
dc.subjectAprendizaje profundoes-ES
dc.subjectExtracción de relacioneses-ES
dc.subjectProcesamiento de lenguaje naturales-ES
dc.titleHow to Adapt Deep Learning Models to a New Domain: The Case of Biomedical Relation Extractionen-US
dc.titleCómo adaptar un modelo de aprendizaje profundo a un nuevo dominio: el caso de la extracción de relaciones biomédicases-ES
dc.typeinfo:eu-repo/semantics/article
dc.typeinfo:eu-repo/semantics/publishedVersion
dc.typeResearch Papersen-US
dc.typeArtículos de investigaciónes-ES

