Fuzzy Modeling for Multi-Label Text Classification Supported by Classification Algorithms

Beatriz Wilges; Gustavo Mateus; Silvia Nassar; Renato Cislaghi; Rogério Cid Bastos

doi:10.3844/jcssp.2016.341.349

Research Article Open Access

Beatriz Wilges¹, Gustavo Mateus¹, Silvia Nassar¹, Renato Cislaghi¹ and Rogério Cid Bastos¹

¹ Federal University of Santa Catarina, Brazil

Abstract

The ever-increasing amount of information on the Web is organized as structured, semi-structured and unstructured data. Text classification systems capable of handling these different structures can support important tasks such as indexing and information retrieval in search engines. The objective of this research is to develop a fuzzy logic method for classifying documents into multiple categories. The method was built through a pattern recognition process and uses two input variables, called similarity and accuracy, which measure the similarity and accuracy of a document against a database of terms. The database of terms is generated from a collection of documents pre-classified into the categories of interest. The documents, processed for similarity and accuracy against the database of terms, compose a training set, also called the knowledge base. From this knowledge base, a knowledge discovery process based on data mining identifies a pattern that specifies a set of rules. This made it possible to define a general model that is used to create the rules and membership functions of the fuzzy model for classifying documents into multiple categories. The general model of the rules, identified in the data mining process and implemented in the fuzzy model, considers the most significant variables and also contributes to the specification of the membership functions, including the definition of the linguistic terms of the fuzzy sets. As a result, the inputs, membership functions and inference rules of the fuzzy model could be defined in a more deterministic way. The results of the proposed method are relevant because it achieves a satisfactory accuracy rate in document classification.
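
The idea of assigning a document to several categories from two fuzzy inputs can be illustrated with a minimal Python sketch. It is not the authors' model: the abstract does not give the membership functions, rules or the way similarity and accuracy are computed, so the breakpoints, the single rule, the threshold and the example scores below are all hypothetical, and the function names `ramp_up`, `belongs` and `classify` are introduced only for illustration.

```python
def ramp_up(x, lo, hi):
    """Degree of membership rising linearly from 0 at lo to 1 at hi."""
    if x <= lo:
        return 0.0
    if x >= hi:
        return 1.0
    return (x - lo) / (hi - lo)


def belongs(similarity, accuracy):
    """Hypothetical rule: IF similarity is high AND accuracy is high
    THEN the document belongs to the category.
    The fuzzy AND is taken as the minimum of the two degrees."""
    sim_high = ramp_up(similarity, 0.3, 0.7)  # assumed breakpoints
    acc_high = ramp_up(accuracy, 0.4, 0.8)    # assumed breakpoints
    return min(sim_high, acc_high)


def classify(scores, threshold=0.5):
    """Return every category whose rule fires at or above the threshold,
    so one document can receive several labels."""
    return sorted(
        category
        for category, (similarity, accuracy) in scores.items()
        if belongs(similarity, accuracy) >= threshold
    )


# Made-up (similarity, accuracy) pairs for one document, one pair per category.
example_scores = {
    "sports": (0.9, 0.9),
    "politics": (0.6, 0.7),
    "health": (0.2, 0.9),
}
labels = classify(example_scores)
print(labels)
```

In the paper's approach the rules and membership functions are derived from data mining of the knowledge base rather than fixed by hand as they are in this sketch.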

Journal of Computer Science

Volume 12 No. 7, 2016, 341-349

DOI: https://doi.org/10.3844/jcssp.2016.341.349

Submitted On: 5 July 2016 Published On: 22 August 2016

How to Cite: Wilges, B., Mateus, G., Nassar, S., Cislaghi, R. & Bastos, R. C. (2016). Fuzzy Modeling for Multi-Label Text Classification Supported by Classification Algorithms. Journal of Computer Science, 12(7), 341-349. https://doi.org/10.3844/jcssp.2016.341.349

Copyright: © 2016 Beatriz Wilges, Gustavo Mateus, Silvia Nassar, Renato Cislaghi and Rogério Cid Bastos. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Keywords

Text Categorization
Decision Tree
Fuzzy Modeling