NLP Tool for Extracting Relevant Information from Criminal Reports or Fakes/Propaganda Content
Authors: | Vysotska, Victoria; Mazepa, Svitlana; Chyrun, Lyubomyr; Brodyak, Oksana; Shakleina, Iryna; Schuchmann, Vadim |
Keywords: | Character recognition; Crime; fake; Fake detection; forecasting; logistic regression; Logistics regressions; machine learning; Machine-learning; Natural language processing systems; NLP; NLTK; Porter stemmer; Porter's stemmer; propaganda; SpaCy; Support vector machines; SVM; text analysis; Text processing; text recognition; TF-IDF; TfidfVectorizer |
Publication date: | 2022 |
Publisher: | Institute of Electrical and Electronics Engineers Inc. |
Published in: | International Scientific and Technical Conference on Computer Sciences and Information Technologies |
Volume: | 2022-November |
Pages: | 93 – 98 |
Abstract: | The goal of the paper is to develop a natural language processing system that extracts relevant information from criminal reports or fake/propaganda content. The developed system is described in detail, and its technical requirements are formulated. System development is divided into three stages: data preparation, word embedding, and classification. The structure of the product under development is described, along with the libraries used and the ways the methods were implemented. The structure of the neural network and its initial parameters are described in detail; they are based on knowledge rather than machine learning methods, which means that patterns must be specified explicitly rather than learned and generalized. The second aim of the project was to apply text summarization to case reports, that is, to convert long, detailed reports into a coherent natural-language summary. The article describes the implemented training of a model for identifying propaganda using keywords. Because propaganda potentially carries fakes, and because information noise is generally dangerous and threatens a person's sober thinking, we suggest analysing the news for propaganda. In general, there is a lot of propaganda in the world, but many more fakes do not carry propaganda at all, which improves the accuracy of model training. In this way, the approach makes it possible to filter out both fakes and propaganda. © 2022 IEEE. |
Description: | Cited by: 2; Conference name: 17th IEEE International Conference on Computer Science and Information Technologies, CSIT 2022; Conference date: 10 November 2022 through 12 November 2022; Conference code: 185883 |
ISBN: | 9798350334319 |
ISSN: | 2766-3655 |
DOI: | 10.1109/CSIT56902.2022.10000563 |
External URL: | https://www.scopus.com/inward/record.uri?eid=2-s2.0-85146328359&doi=10.1109%2fCSIT56902.2022.10000563&partnerID=40&md5=025e9b56ce577adcb03a67be45f6f39a |