A Network Approach to Visualizing the Evolution of Research on Cross-Lingual Semantic Similarity

Aida Khatifovna Khakimova

doi:10.30987/conferencearticle_5fce2773d960b0.37534641

Section: 3. Socioeconomic Technologies

Proceedings: International Conference "Physical and Technical Informatics - CPT2020", Volume 1

UDC 81 Linguistics. Language studies. Languages

BISAC LAN009000 Linguistics / General

Aida Khatifovna Khakimova ¹

Author and publication information

Authors:

1. Kama Institute; ANO "Scientific Research Center for Physical and Technical Informatics" (Associate Professor; Leading Researcher)
Naberezhnye Chelny, Republic of Tatarstan, Russia

Type:

Conference article

DOI:

https://doi.org/10.30987/conferencearticle_5fce2773d960b0.37534641

Pages:

314-321

Published:

7 December 2020

Language of the material:

English

Keywords:

text mining, tech mining, cross-lingual semantic similarity, visualization, scientific network, bibliometrics

Abstract:
This article presents a bibliometric study of publications on cross-lingual semantic similarity indexed in the Dimensions database. Visualization of scientific networks showed that the research is fragmented and that collaboration between organizations is limited. Leading countries, organizations, and authors are identified. Overlay visualization made it possible to assess trends in author citations. The geography of the research is shown to be expanding. International cooperation requires uniform semantic approaches to describing the concepts of critical infrastructure, incidents, resources, and the services involved in maintaining and protecting them. The approaches described can be applied to visualizing and modeling technological development in today's digital world. Semantic similarity is a long-standing problem in natural language processing (NLP). The semantic similarity between two words or concepts is their semantic closeness (or, conversely, their semantic distance). It matters in NLP because it plays an important role in information retrieval, information extraction, text mining, web mining, and many other applications.
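
The abstract describes semantic similarity as the closeness (or distance) between two words. As an illustration only, one common way to put a number on that closeness is the cosine similarity between word vectors that share one cross-lingual space. The sketch below is not the method of the article, which is a bibliometric study. The vectors, the `en:`/`ru:` keys, and the function names are hypothetical and exist only to show the arithmetic.

```python
import math


def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def cosine_distance(u, v):
    """Semantic distance as the complement of cosine similarity."""
    return 1.0 - cosine_similarity(u, v)


# Hypothetical toy vectors in a shared cross-lingual space.
# The values are invented for illustration, not taken from any model.
toy_vectors = {
    "en:dog": [0.9, 0.1, 0.0],
    "ru:sobaka": [0.8, 0.2, 0.1],  # transliterated Russian word for "dog"
    "en:bank": [0.0, 0.3, 0.9],
}

# A translation pair should score higher than an unrelated pair.
sim_translation = cosine_similarity(toy_vectors["en:dog"], toy_vectors["ru:sobaka"])
sim_unrelated = cosine_similarity(toy_vectors["en:dog"], toy_vectors["en:bank"])
```

With these invented values the translation pair scores higher than the unrelated pair, which is the behavior a cross-lingual similarity measure is expected to show. In practice the vectors would come from a trained cross-lingual embedding model.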
