RESEARCH AND DEVELOPMENT OF LINGUO-STATISTICAL METHODS FOR FORMING A PORTRAIT OF A SUBJECT AREA
Abstract and keywords
Abstract (English):
The project addresses the fundamental scientific problem of semantic modeling. Within this framework, a methodology is developed for the automated identification of translation links (translation correspondences), as well as hierarchical, synonymous and associative links, in Internet texts, and for the construction of multilingual associative hierarchical portraits of a subject area (MAHPSA), in particular for the domain of autonomous uninhabited underwater vehicles (UUV). Taking multilingual and heterogeneous resources into account gives a more complete picture of developments in the subject area and makes it possible to trace where ideas originate, how fast and in which directions they spread, and to identify significant documents and promising research directions. The solution is based on an integrated approach that combines methods of statistics, corpus linguistics and distributional semantics. It is implemented as a technology built on linguo-statistical mechanisms for forming a multilingual associative hierarchical portrait of a subject area: a dictionary of significant terms of the subject area whose elements are organized into synonym series (synsets), including translation correspondences, and connected by associative and hierarchical relationships.
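
The portrait described in the abstract (synsets with translation correspondences, plus hierarchical and associative links) can be pictured as a small data structure. The sketch below is only an illustration under stated assumptions, not the project's implementation: the class names `Synset` and `Portrait`, the field names, the association weights and the toy English/Russian entries are all hypothetical and do not come from the publication.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Synset:
    """One synonym series; terms are grouped by language code (hypothetical layout)."""
    synset_id: str
    terms: Dict[str, List[str]] = field(default_factory=dict)
    # Hierarchical links: ids of broader (parent) synsets.
    hypernyms: Set[str] = field(default_factory=set)
    # Associative links: related synset id -> illustrative strength in [0, 1].
    associations: Dict[str, float] = field(default_factory=dict)


class Portrait:
    """A toy multilingual associative hierarchical portrait of a subject area."""

    def __init__(self) -> None:
        self.synsets: Dict[str, Synset] = {}
        self._index: Dict[tuple, str] = {}  # (language, lowercased term) -> synset id

    def add(self, synset: Synset) -> None:
        self.synsets[synset.synset_id] = synset
        for lang, terms in synset.terms.items():
            for term in terms:
                self._index[(lang, term.lower())] = synset.synset_id

    def translations(self, term: str, source: str, target: str) -> List[str]:
        """Return target-language members of the synset that contains `term`."""
        sid = self._index.get((source, term.lower()))
        if sid is None:
            return []
        return list(self.synsets[sid].terms.get(target, []))

    def ancestors(self, synset_id: str) -> List[str]:
        """Breadth-first walk over hypernym links, skipping repeats and unknown ids."""
        seen: List[str] = []
        queue = sorted(self.synsets[synset_id].hypernyms)
        while queue:
            sid = queue.pop(0)
            if sid in seen or sid not in self.synsets:
                continue
            seen.append(sid)
            queue.extend(sorted(self.synsets[sid].hypernyms))
        return seen


def build_demo_portrait() -> Portrait:
    """Build a three-synset toy portrait; all entries are illustrative assumptions."""
    portrait = Portrait()
    portrait.add(Synset("vehicle", {"en": ["underwater vehicle"],
                                    "ru": ["подводный аппарат"]}))
    portrait.add(Synset("auv",
                        {"en": ["autonomous underwater vehicle", "AUV"],
                         "ru": ["автономный подводный аппарат"]},
                        hypernyms={"vehicle"},
                        associations={"sonar": 0.7}))
    portrait.add(Synset("sonar", {"en": ["sonar"], "ru": ["гидролокатор"]}))
    return portrait


demo = build_demo_portrait()
print(demo.translations("AUV", "en", "ru"))
print(demo.ancestors("auv"))
```

Indexing terms by language and lowercased form keeps translation lookup to a single dictionary access, while hierarchical and associative links are stored per synset so that they apply to every language at once.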

Keywords:
Linguo-statistical methods, associative and hierarchical portrait of a subject area, multilingual integrated ontology, forecasting the spread of ideas, multilingual corpus of a subject area
