NLP Tool for Extracting Relevant Information from Criminal Reports or Fakes/Propaganda Content

Author(s): Vysotska, Victoria
Mazepa, Svitlana
Chyrun, Lyubomyr
Brodyak, Oksana
Shakleina, Iryna
Schuchmann, Vadim
Keywords: Character recognition; Crime; fake; Fake detection; forecasting; logistic regression; Logistics regressions; machine learning; Machine-learning; Natural language processing systems; NLP; NLTK; Porter stemmer; Porter's stemmer; propaganda; SpaCy; Support vector machines; SVM; text analysis; Text processing; text recognition; TF-IDF; TfidfVectorizer
Publication date: 2022
Publisher: Institute of Electrical and Electronics Engineers Inc.
Published in: International Scientific and Technical Conference on Computer Sciences and Information Technologies
Volume: 2022-November
Pages: 93 – 98
Abstract: 
The goal of the paper is to develop a natural language processing system that extracts relevant information from criminal reports or fake/propaganda content. The developed system is described in detail and its technical requirements are formulated. System development is divided into three stages: data preparation, word embedding, and classification. The structure of the product under development is described, together with the libraries used and the ways in which the methods are implemented. The structure of the neural network and its initial parameters are described in detail; they are based on knowledge rather than on machine learning methods, which means that patterns have to be specified explicitly rather than learned and generalized. The second aim of the project was to apply text summarization to case reports, that is, to convert long, detailed reports into a coherent natural-language summary. The article also describes how the model was trained to identify propaganda using keywords. Propaganda carries a potential fake, and since information noise is generally dangerous and threatens a person's sober thinking, we suggest analysing the news for propaganda. In general, there is a lot of propaganda in the world, but many more fakes do not carry propaganda at all, which improves the accuracy of model training. In this way, the approach makes it possible to filter out both fakes and propaganda. © 2022 IEEE.
Description: 
Cited by: 2; Conference name: 17th IEEE International Conference on Computer Science and Information Technologies, CSIT 2022; Conference date: 10 November 2022 through 12 November 2022; Conference code: 185883
ISBN: 9798350334319
ISSN: 2766-3655
DOI: 10.1109/CSIT56902.2022.10000563
External URL: https://www.scopus.com/inward/record.uri?eid=2-s2.0-85146328359&doi=10.1109%2fCSIT56902.2022.10000563&partnerID=40&md5=025e9b56ce577adcb03a67be45f6f39a
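
Illustrative sketch: the abstract names three stages (data preparation, word embedding, classification), and the keywords mention TF-IDF and logistic regression. The example below is a minimal, standard-library-only sketch of such a pipeline, not the authors' implementation. The function names, the four toy training texts, the labels, and the hyperparameters are all assumptions made up for illustration; stemming (the keywords mention a Porter stemmer) is omitted for brevity, and a real system would more likely rely on the NLTK and scikit-learn tools the keywords point to.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Data preparation: lowercase the text and keep alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def fit_idf(docs):
    """Compute smoothed inverse document frequencies from tokenized docs."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}


def tfidf(doc, idf):
    """Turn one tokenized document into an L2-normalised sparse vector."""
    tf = Counter(t for t in doc if t in idf)  # unseen terms are dropped
    vec = {t: c * idf[t] for t, c in tf.items()}
    norm = math.sqrt(sum(v * v for v in vec.values())) or 1.0
    return {t: v / norm for t, v in vec.items()}


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def score(vec, weights, bias):
    """Linear score of a sparse vector under the current model."""
    return bias + sum(weights.get(t, 0.0) * v for t, v in vec.items())


def train_logreg(vectors, labels, epochs=200, lr=0.5):
    """Fit logistic regression with plain stochastic gradient descent."""
    weights, bias = {}, 0.0
    for _ in range(epochs):
        for vec, y in zip(vectors, labels):
            err = sigmoid(score(vec, weights, bias)) - y
            bias -= lr * err
            for t, v in vec.items():
                weights[t] = weights.get(t, 0.0) - lr * err * v
    return weights, bias


def predict_proba(text, idf, weights, bias):
    """Estimated probability that `text` belongs to class 1."""
    return sigmoid(score(tfidf(tokenize(text), idf), weights, bias))


# Toy corpus, invented for this sketch: 1 = propaganda-like, 0 = neutral.
train_texts = [
    "glorious leader crushes treacherous enemies",
    "enemies of the nation spread treacherous lies",
    "city council approves new bus schedule",
    "local library extends weekend opening hours",
]
train_labels = [1, 1, 0, 0]

docs = [tokenize(t) for t in train_texts]
idf = fit_idf(docs)
weights, bias = train_logreg([tfidf(d, idf) for d in docs], train_labels)

for sample in ("treacherous enemies threaten the glorious nation",
               "council extends library hours"):
    print(sample, "->", round(predict_proba(sample, idf, weights, bias), 3))
```

With only four training sentences the probabilities say nothing about real-world accuracy; the point is the shape of the pipeline, in which each stage could be swapped for a library component without changing the others.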
